GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment

Year
2025
Type(s)
Author(s)
Yuancheng Xu, Udari Madhushani Sehwag, Alec Koppel, Sicheng Zhu, Bang An, Furong Huang, and Sumitra Ganesh
Source
In The Thirteenth International Conference on Learning Representations (ICLR), 2025
Url
https://arxiv.org/abs/2410.08193
BibTeX