Principal-Driven Reward Design and Agent Policy Alignment via Bilevel-RL

Year
2023
Type(s)
Author(s)
Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Furong Huang, Mengdi Wang
Source
Interactive Learning with Implicit Human Feedback Workshop (ILHF), ICML 2023.