Heads we win, tails you lose: AI detectors in education. Bassett, M. A., Bradshaw, W., Hogg, A., Murdoch, K., Bornsztejn, H., Pearce, B., & Webber, C. October 2025.
The increasing use of generative artificial intelligence in student assessment has led to reliance on generative artificial intelligence detection tools by educational institutions. Unlike plagiarism detection, which identifies direct matches, AI detectors rely on unverifiable probabilistic assessments. False positives are indistinguishable from genuine cases. In this paper, we argue that generative artificial intelligence detection should not be used in education due to its methodological imperfections, violation of procedural fairness, and unverifiable outputs. Generative artificial intelligence detectors cannot be tested in real-world conditions where the true origin of a text is unknown. Attempts to validate results through linguistic markers, multiple tools, or comparisons with past work introduce confirmation bias rather than independent verification. Moreover, categorising text as human- or AI-generated imposes a false dichotomy that ignores work created with, not by, AI. Generative artificial intelligence detection also raises security concerns, as many tools lack transparency regarding data security. Academic integrity investigations must rely on evidence meeting the balance of probabilities standard, which generative artificial intelligence detection scores do not satisfy. Educational institutions and sectors should move away from punitive detection policies and focus on assessment design that integrates AI’s role in learning, ensuring fairness, transparency, and pedagogical integrity.
@misc{bassett_heads_2025,
	title = {Heads we win, tails you lose: {AI} detectors in education},
	shorttitle = {Heads we win, tails you lose},
	url = {https://osf.io/preprints/edarxiv/93w6j_v1/},
	doi = {10.35542/osf.io/93w6j_v1},
	abstract = {The increasing use of generative artificial intelligence in student assessment has led to reliance on generative artificial intelligence detection tools by educational institutions. Unlike plagiarism detection, which identifies direct matches, AI detectors rely on unverifiable probabilistic assessments. False positives are indistinguishable from genuine cases. In this paper, we argue that generative artificial intelligence detection should not be used in education due to its methodological imperfections, violation of procedural fairness, and unverifiable outputs. Generative artificial intelligence detectors cannot be tested in real-world conditions where the true origin of a text is unknown. Attempts to validate results through linguistic markers, multiple tools, or comparisons with past work introduce confirmation bias rather than independent verification. Moreover, categorising text as human- or AI-generated imposes a false dichotomy that ignores work created with, not by, AI. Generative artificial intelligence detection also raises security concerns, as many tools lack transparency regarding data security. Academic integrity investigations must rely on evidence meeting the balance of probabilities standard, which generative artificial intelligence detection scores do not satisfy. Educational institutions and sectors should move away from punitive detection policies and focus on assessment design that integrates AI’s role in learning, ensuring fairness, transparency, and pedagogical integrity.},
	urldate = {2025-12-01},
	publisher = {EdArXiv},
	author = {Bassett, Mark A. and Bradshaw, Wayne and Hogg, Alyce and Murdoch, Kane and Bornsztejn, Hannah and Pearce, Bridget and Webber, Colin},
	month = oct,
	year = {2025},
	keywords = {Artificial intelligence detection, academic integrity, generative artificial intelligence, higher education},
}
