In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 1321–1324, Ann Arbor, MI, USA, June 2018. Venue Rating: CORE A*
Current plagiarism detection systems reliably find instances of copied and moderately altered text, but often fail to detect strong paraphrases, translations, and the reuse of non-textual content and ideas. To improve upon the detection capabilities for such concealed content reuse in academic publications, we make four contributions: i) We present the first plagiarism detection approach that combines the analysis of mathematical expressions, images, citations, and text. ii) We describe the implementation of this hybrid detection approach in the research prototype HyPlag. iii) We present novel visualization and interaction concepts to aid users in reviewing content similarities identified by the hybrid detection approach. iv) We demonstrate the usefulness of the hybrid detection and result visualization approaches by using HyPlag to analyze a confirmed case of content reuse present in a retracted research publication.