Can AI-Generated Text be Reliably Detected? Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W., & Feizi, S. June 2023. arXiv:2303.11156 [cs]
@misc{sadasivan_can_2023,
	title = {Can {AI}-{Generated} {Text} be {Reliably} {Detected}?},
	url = {http://arxiv.org/abs/2303.11156},
	doi = {10.48550/arXiv.2303.11156},
	abstract = {Sadasivan et al. (2023) investigate the reliability of current methods for detecting AI-generated text. The authors argue that the rapid progress of large language models (LLMs) has led to potential misuse, such as plagiarism and fake news generation, necessitating reliable detection methods. However, they demonstrate both empirically and theoretically that current detection methods, including watermarking techniques and neural network-based detectors, are not reliable in practical scenarios. They show that simple paraphrasing attacks can significantly degrade the performance of these detectors. The authors also present a theoretical impossibility result, suggesting that as LLMs become more sophisticated and better at emulating human text, the performance of even the best-possible detector decreases. For a sufficiently advanced language model seeking to imitate human text, even the best-possible detector may only perform marginally better than a random classifier. 
In terms of policy implications, this paper suggests that lawmakers and regulators need to be aware of the limitations of current AI-generated text detection methods. As AI models become more sophisticated, traditional detection methods may become less effective, potentially leading to an increase in the misuse of AI for unethical purposes. Policymakers may need to consider alternative strategies for regulating the use of AI and mitigating potential harms, such as developing new detection technologies, implementing stricter controls on the use of AI for text generation, or promoting transparency and accountability in AI development. (AI + Policy, AI Text generation cannot be reliably detected)},
	urldate = {2024-01-26},
	publisher = {arXiv},
	author = {Sadasivan, Vinu Sankar and Kumar, Aounon and Balasubramanian, Sriram and Wang, Wenxiao and Feizi, Soheil},
	month = jun,
	year = {2023},
	note = {arXiv:2303.11156 [cs]},
	keywords = {20, AI reliability and robustness, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning, gw\_abstracts},
}
