Publications

PoisonedParrot: Subtle Data Poisoning Attacks to Elicit Copyright-Infringing Content from Large Language Models

The 2025 Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL), 2025.
Panaitescu-Liess, Michael-Andrei, Pankayaraj Pathmanathan, Yigitcan Kaya, Zora Che, Bang An, Sicheng Zhu, Aakriti Agrawal, and Furong Huang.

Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

The 2025 Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL), 2025.
Liu, Xiaoyu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, Yuhang Zhou, Fuxiao Liu, Tianrui Guan, Haoliang Wang, Tong Yu, Julian McAuley, Wei Ai, and Furong Huang.

Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayesian Theory

The 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
Zhang, Zhi, Chris Chow, Yasi Zhang, Yanchao Sun, Haochen Zhang, Eric Hanchen Jiang, Han Liu, Furong Huang, Yuchen Cui, and Oscar Hernan Madrid Padilla.

GFairHint: Improving Individual Fairness for Graph Neural Networks via Fairness Hint

ACM Transactions on Knowledge Discovery from Data (TKDD), 2025.
Xu, Paiheng, Yuhang Zhou, Bang An, Wei Ai, and Furong Huang.

Safety Guaranteed Robust Multi-Agent Reinforcement Learning with Hierarchical Control for Connected and Automated Vehicles

IEEE International Conference on Robotics and Automation (ICRA), 2025.
Zhang, Zhili, H M Sabbir Ahmad, Ehsan Sabouni, Yanchao Sun, Furong Huang, Wenchao Li, and Fei Miao.

Is poisoning a real threat to DPO? Maybe more so than you think

AAAI 2025 AI Alignment Track (AAAI), 2025.
Pathmanathan, Pankayaraj, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, and Furong Huang.

Is poisoning a real threat to LLM alignment? Maybe more so than you think

ICML 2024 Workshop on Models of Human Feedback for AI Alignment, 2024.
Pathmanathan, Pankayaraj, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, and Furong Huang.

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?

The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
Panaitescu-Liess, Michael-Andrei, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, and Furong Huang.

Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization

The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2024.
Ding, Mucong, Chenghao Deng, Jocelyn Choo, Zichu Wu, Aakriti Agrawal, Avi Schwarzschild, Tianyi Zhou, Tom Goldstein, John Langford, Anima Anandkumar, and Furong Huang.