Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization

Year
2024
Author(s)
Ding, Mucong, Chenghao Deng, Jocelyn Choo, Zichu Wu, Aakriti Agrawal, Avi Schwarzschild, Tianyi Zhou, Tom Goldstein, John Langford, Anima Anandkumar, and Furong Huang.
Source
The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2024.
Url
https://arxiv.org/abs/2409.18433
BibTeX