PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach

Year

2026

Type(s)

Conference articles

Author(s)

Udari Madhushani Sehwag, Shayan Shabihi, Alex McAvoy, Vikash Sehwag, Yuancheng Xu, Dalton Towers, and Furong Huang

Source

In The Fourteenth International Conference on Learning Representations (ICLR), 2026
