Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection. Wahle, J. P., Ruas, T., Meuschke, N., & Gipp, B. In 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL), pages 226–229.
The rise of language models such as BERT allows for high-quality text paraphrasing. This poses a problem for academic integrity, as it is difficult to differentiate between original and machine-generated content. We propose a benchmark consisting of articles paraphrased using recent language models relying on the Transformer architecture. Our contribution fosters future research on paraphrase detection systems, as it offers a large collection of aligned original and paraphrased documents, a study regarding its structure, and classification experiments with state-of-the-art systems. We make our findings publicly available.
@inproceedings{WahleRMG21,
  title = {Are {{Neural Language Models Good Plagiarists}}? {{A Benchmark}} for {{Neural Paraphrase Detection}}},
  shorttitle = {Are {{Neural Language Models Good Plagiarists}}?},
  booktitle = {2021 {{ACM}}/{{IEEE Joint Conference}} on {{Digital Libraries}} ({{JCDL}})},
  author = {Wahle, Jan Philip and Ruas, Terry and Meuschke, Norman and Gipp, Bela},
  date = {2021-09},
  eprint = {2103.12450},
  eprinttype = {arxiv},
  eprintclass = {cs},
  pages = {226--229},
  doi = {10.1109/JCDL52503.2021.00065},
  url = {https://arxiv.org/abs/2103.12450},
  urldate = {2022-11-04},
  abstract = {The rise of language models such as BERT allows for high-quality text paraphrasing. This is a problem to academic integrity, as it is difficult to differentiate between original and machine-generated content. We propose a benchmark consisting of paraphrased articles using recent language models relying on the Transformer architecture. Our contribution fosters future research of paraphrase detection systems as it offers a large collection of aligned original and paraphrased documents, a study regarding its structure, classification experiments with state-of-the-art systems, and we make our findings publicly available.},
  keywords = {Computer Science - Artificial Intelligence,Computer Science - Computation and Language,Computer Science - Digital Libraries},
  file = {C\:\\Users\\ruast\\Zotero\\storage\\7P6V49B7\\WahleRMG21--tr--are_neural_language_models_good_plagiarists_a_benchmark_for_neural_paraphrase_detection.pdf;C\:\\Users\\ruast\\Zotero\\storage\\CXPL4A8B\\2103.html}
}
