Publications

Position: TrustLLM: Trustworthiness in Large Language Models

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
Huang, Yue, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Hanchi Sun, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric P. Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Yang Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao.

WAVES: Benchmarking the Robustness of Image Watermarks

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
An, Bang, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal, Yuancheng Xu, Chenghao Deng, Sicheng Zhu, Abdirisak Mohamed, Yuxin Wen, Tom Goldstein, Furong Huang.

Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
Xu, Yuancheng, Chenghao Deng, Yanchao Sun, Ruijie Zheng, Xiyao Wang, Jieyu Zhao, Furong Huang.

Equal Long-term Benefit Rate: Adapting Static Fairness Notions to Sequential Decision Making

AdvML-Frontiers workshop, ICML 2023.
Xu, Yuancheng, Chenghao Deng, Yanchao Sun, Ruijie Zheng, Xiyao Wang, Jieyu Zhao, Furong Huang.

PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
Zheng, Ruijie, Ching-An Cheng, Hal Daumé III, Furong Huang, Andrey Kolobov.

Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
Zheng, Ruijie, Yongyuan Liang, Xiyao Wang, Shuang Ma, Hal Daumé III, Huazhe Xu, John Langford, Praveen Palanisamy, Kalyan Shankar Basu, Furong Huang.

ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
Ji, Tianying, Yongyuan Liang, Yan Zeng, Yu Luo, Guowei Xu, Jiawei Guo, Ruijie Zheng, Furong Huang, Fuchun Sun, Huazhe Xu.

MaxMin-RLHF: Alignment with Diverse Human Preferences

Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
Chakraborty, Souradip, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Bedi, Mengdi Wang.

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences

ICML 2024 Workshop on Models of Human Feedback for AI Alignment (Oral), 2024.
Chakraborty, Souradip, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Bedi, Mengdi Wang.