DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

Year
2025
Type(s)
Author(s)
Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, and Furong Huang
Source
In The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Url
https://arxiv.org/abs/2505.15074