GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment

Year
2025
Author(s)
Xu, Yuancheng, Udari Madhushani Sehwag, Alec Koppel, Sicheng Zhu, Bang An, Furong Huang, and Sumitra Ganesh.
Source
The Thirteenth International Conference on Learning Representations (ICLR), 2025.
BibTeX