Furong Huang
Associate Professor @ University of Maryland
Position Paper: On the Possibilities of AI-Generated Text Detection
Year
2024
Type(s)
Conference proceedings
Author(s)
Chakraborty, Souradip, Amrit Bedi, Sicheng Zhu, Bang An, Dinesh Manocha, and Furong Huang
Source
Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
BibTeX
@InProceedings{pmlr-v235-chakraborty24a,
  title     = {Position: On the Possibilities of {AI}-Generated Text Detection},
  author    = {Chakraborty, Souradip and Bedi, Amrit and Zhu, Sicheng and An, Bang and Manocha, Dinesh and Huang, Furong},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {6093--6115},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/chakraborty24a/chakraborty24a.pdf},
  url       = {https://proceedings.mlr.press/v235/chakraborty24a.html},
  abstract  = {Our study addresses the challenge of distinguishing human-written text from Large Language Model (LLM) outputs. We provide evidence that this differentiation is consistently feasible, except when human and machine text distributions are indistinguishable across their entire support. Employing information theory, we show that while detecting machine-generated text becomes harder as it nears human quality, it remains possible with adequate text data. We introduce guidelines on the required text data quantity, either through sample size or sequence length, for reliable AI text detection, through derivations of sample complexity bounds. This research paves the way for advanced detection methods. Our comprehensive empirical tests, conducted across various datasets (Xsum, Squad, IMDb, and Kaggle FakeNews) and with several state-of-the-art text generators (GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, Llama-2-70B-Chat-HF), assess the viability of enhanced detection methods against detectors like RoBERTa-Large/Base-Detector and GPTZero, with increasing sample sizes and sequence lengths. Our findings align with OpenAI's empirical data related to sequence length, marking the first theoretical substantiation for these observations.}
}
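The abstract's central claim — that detection becomes reliable given enough text data, even when individual samples are hard to classify — can be illustrated with a toy simulation. This sketch is not the paper's method; the score distributions, the 0.5 mean gap, and the averaging rule are all hypothetical choices made purely to show how aggregating more i.i.d. detector scores sharpens the separation between human and machine text:

```python
import numpy as np

rng = np.random.default_rng(0)

def auroc(neg, pos):
    # Probability that a random "machine" score exceeds a random "human" score.
    return float((pos[:, None] > neg[None, :]).mean())

def detection_auroc(n_samples, n_trials=2000, gap=0.5):
    # Hypothetical per-text detector scores: human ~ N(0, 1), machine ~ N(gap, 1).
    # Each detection decision averages the scores of n_samples i.i.d. texts,
    # shrinking the noise by a factor of sqrt(n_samples).
    human = rng.normal(0.0, 1.0, size=(n_trials, n_samples)).mean(axis=1)
    machine = rng.normal(gap, 1.0, size=(n_trials, n_samples)).mean(axis=1)
    return auroc(human, machine)

auc_single = detection_auroc(1)    # one text per decision: weak separation
auc_pooled = detection_auroc(10)   # ten texts per decision: much stronger
print(f"AUROC with 1 sample:  {auc_single:.3f}")
print(f"AUROC with 10 samples: {auc_pooled:.3f}")
```

Even though a single sample gives only modest discrimination, pooling ten samples pushes the AUROC well above it — the same qualitative trend the paper's sample complexity bounds formalize for increasing sample sizes and sequence lengths.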