AdvBDGen: Adversarially Fortified Prompt-Specific Fuzzy Backdoor Generator Against LLM Alignment
In The AAAI-25 Workshop on Artificial Intelligence for Cyber Security, AICS 2025
Pankayaraj Pathmanathan and Udari Madhushani Sehwag and Michael-Andrei Panaitescu-Liess and Furong Huang
Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs
In 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2026
James Beetham and Souradip Chakraborty and Mengdi Wang and Furong Huang and Amrit Singh Bedi and Mubarak Shah
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
In The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), Spotlight, 2025
Xiyao Wang and Zhengyuan Yang and Chao Feng and Hongjin Lu and Linjie Li and Chung-Ching Lin and Kevin Lin and Furong Huang and Lijuan Wang
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
In The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
Xiyao Wang and Zhengyuan Yang and Chao Feng and Yuhang Zhou and Xiaoyu Liu and Yongyuan Liang and Ming Li and Ziyi Zang and Linjie Li and Chung-Ching Lin and Kevin Lin and Furong Huang and Lijuan Wang
Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models
In The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
Soumya Suvra Ghosal and Souradip Chakraborty and Avinash Reddy and Yifu Lu and Mengdi Wang and Dinesh Manocha and Furong Huang and Mohammad Ghavamzadeh and Amrit Singh Bedi
A Technical Report on “Erasing the Invisible”: The 2024 NeurIPS Competition on Stress Testing Image Watermarks
In The Thirty-ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS), 2025
Mucong Ding and Bang An and Tahseen Rabbani and Chenghao Deng and Anirudh Satheesh and Souradip Chakraborty and Mehrdad Saberi and Yuxin Wen and Kyle Rui Sang and Aakriti Agrawal and Xuandong Zhao and Mo Zhou and Mary-Anne Hartley and Lei Li and Yu-Xiang Wang and Vishal M. Patel and Soheil Feizi and Tom Goldstein and Furong Huang
Practical Memorization Tests for Detecting Copyrighted Data in Large Language Models
In Seventh Workshop on Privacy in Natural Language Processing (PrivateNLP), ACL 2026
Michael-Andrei Panaitescu-Liess and Aadi Palnitkar and Archit Kambhamettu and Yigitcan Kaya and Daniel Brown and Sungbin Oh and Sean Michael McLeish and Marco Bornstein and Furong Huang and Tom Goldstein
Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
In The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Aakriti Agrawal and Rohith Aralikatti and Anirudh Satheesh and Souradip Chakraborty and Amrit Singh Bedi and Furong Huang
DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
In The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Yuhang Zhou and Jing Zhu and Shengyi Qian and Zhuokai Zhao and Xiyao Wang and Xiaoyu Liu and Ming Li and Paiheng Xu and Wei Ai and Furong Huang
Imagine, Verify, Execute: Agentic Exploration with Vision-Language Models
In 9th Annual Conference on Robot Learning (CoRL), 2025
Seungjae^ Lee and Daniel Ekpo^ and Haowen Liu and Furong Huang and Abhinav Shrivastava and Jia-Bin Huang
