Reducing systematic review burden using Deduklick: a novel, automated, reliable, and explainable deduplication algorithm to foster medical research

Reducing systematic review burden using Deduklick: a novel, automated, reliable, and explainable deduplication algorithm to foster medical research. Borissov, N., Haas, Q., Minder, B., Kopp-Heim, D., Von Gernler, M., Janka, H., Teodoro, D., & Amini, P. Systematic Reviews, 11(1):172, August, 2022.


Abstract

Background: Identifying and removing reference duplicates when conducting systematic reviews (SRs) remain a major, time-consuming issue for authors who manually check for duplicates using built-in features in citation managers. To address issues related to manual deduplication, we developed an automated, efficient, and rapid artificial intelligence-based algorithm named Deduklick. Deduklick combines natural language processing algorithms with a set of rules created by expert information specialists.

Methods: Deduklick’s deduplication uses a multistep algorithm of data normalization, calculates a similarity score, and identifies unique and duplicate references based on metadata fields, such as title, authors, journal, DOI, year, issue, volume, and page number range. We measured and compared Deduklick’s capacity to accurately detect duplicates with the information specialists’ standard, manual duplicate removal process using EndNote on eight existing heterogeneous datasets. Using a sensitivity analysis, we manually cross-compared the efficiency and noise of both methods.

Discussion: Deduklick achieved average recall of 99.51%, average precision of 100.00%, and average F1 score of 99.75%. In contrast, the manual deduplication process achieved average recall of 88.65%, average precision of 99.95%, and average F1 score of 91.98%. Deduklick achieved equal to higher expert-level performance on duplicate removal. It also preserved high metadata quality and drastically reduced time spent on analysis. Deduklick represents an efficient, transparent, ergonomic, and time-saving solution for identifying and removing duplicates in SRs searches. Deduklick could therefore simplify SRs production and represent important advantages for scientists, including saving time, increasing accuracy, reducing costs, and contributing to quality SRs.
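The abstract names three steps (normalization, a similarity score, and a duplicate/unique decision over metadata fields) and reports precision, recall, and F1. The sketch below illustrates that general shape only; it is not Deduklick's implementation. The field names, the use of Python's `difflib.SequenceMatcher` as a stand-in for the paper's natural language processing component, the DOI shortcut, the 0.9 threshold, and the second and third sample records are all assumptions made for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Metadata fields compared (hypothetical key names, following the abstract's list).
FIELDS = ("title", "authors", "journal", "year", "volume", "issue", "pages")


def normalize(value):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", str(value or ""))
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def doi_key(record):
    """Case-insensitive DOI, or an empty string when the field is missing."""
    return str(record.get("doi") or "").strip().lower()


def similarity(a, b):
    """Mean string similarity over the fields that both records fill in."""
    scores = []
    for field in FIELDS:
        x, y = normalize(a.get(field)), normalize(b.get(field))
        if x and y:
            scores.append(SequenceMatcher(None, x, y).ratio())
    return sum(scores) / len(scores) if scores else 0.0


def is_duplicate(a, b, threshold=0.9):
    """Assumed rule: identical DOIs match; otherwise compare the score to a threshold."""
    doi_a, doi_b = doi_key(a), doi_key(b)
    if doi_a and doi_a == doi_b:
        return True
    return similarity(a, b) >= threshold


def deduplicate(records, threshold=0.9):
    """Keep the first record of each duplicate group; return (unique, duplicates)."""
    unique, duplicates = [], []
    for rec in records:
        if any(is_duplicate(rec, kept, threshold) for kept in unique):
            duplicates.append(rec)
        else:
            unique.append(rec)
    return unique, duplicates


def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# The first record is taken from this entry; the other two are hypothetical.
records = [
    {"title": "Reducing systematic review burden using Deduklick",
     "authors": "Borissov, N.; Haas, Q.", "journal": "Systematic Reviews",
     "year": "2022", "doi": "10.1186/s13643-022-02045-9"},
    {"title": "Reducing Systematic Review Burden Using Deduklick.",
     "authors": "Borissov N, Haas Q", "journal": "Systematic reviews",
     "year": 2022},
    {"title": "A completely different study of citation managers",
     "authors": "Smith, J.", "journal": "Another Journal", "year": "2019"},
]
unique, duplicates = deduplicate(records)
print(len(unique), "unique,", len(duplicates), "duplicate")
```

The greedy pass compares each record against every record already kept, which is quadratic in the number of references; a real tool would likely block or index records first. The reported figures are averages over eight datasets, so they need not equal the F1 computed from the averaged precision and recall.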

@article{borissov2022ReducingSystematic,
	title = {Reducing systematic review burden using {Deduklick}: a novel, automated, reliable, and explainable deduplication algorithm to foster medical research},
	volume = {11},
	issn = {2046-4053},
	shorttitle = {Reducing systematic review burden using {Deduklick}},
	url = {https://systematicreviewsjournal.biomedcentral.com/articles/10.1186/s13643-022-02045-9},
	doi = {10.1186/s13643-022-02045-9},
	abstract = {Background: Identifying and removing reference duplicates when conducting systematic reviews (SRs) remain a major, time-consuming issue for authors who manually check for duplicates using built-in features in citation managers. To address issues related to manual deduplication, we developed an automated, efficient, and rapid artificial intelligence-based algorithm named Deduklick. Deduklick combines natural language processing algorithms with a set of rules created by expert information specialists. Methods: Deduklick’s deduplication uses a multistep algorithm of data normalization, calculates a similarity score, and identifies unique and duplicate references based on metadata fields, such as title, authors, journal, DOI, year, issue, volume, and page number range. We measured and compared Deduklick’s capacity to accurately detect duplicates with the information specialists’ standard, manual duplicate removal process using EndNote on eight existing heterogeneous datasets. Using a sensitivity analysis, we manually cross-compared the efficiency and noise of both methods. Discussion: Deduklick achieved average recall of 99.51\%, average precision of 100.00\%, and average F1 score of 99.75\%. In contrast, the manual deduplication process achieved average recall of 88.65\%, average precision of 99.95\%, and average F1 score of 91.98\%. Deduklick achieved equal to higher expert-level performance on duplicate removal. It also preserved high metadata quality and drastically reduced time spent on analysis. Deduklick represents an efficient, transparent, ergonomic, and time-saving solution for identifying and removing duplicates in SRs searches. Deduklick could therefore simplify SRs production and represent important advantages for scientists, including saving time, increasing accuracy, reducing costs, and contributing to quality SRs.},
	language = {en},
	number = {1},
	urldate = {2025-10-03},
	journal = {Systematic Reviews},
	author = {Borissov, Nikolay and Haas, Quentin and Minder, Beatrice and Kopp-Heim, Doris and Von Gernler, Marc and Janka, Heidrun and Teodoro, Douglas and Amini, Poorya},
	month = aug,
	year = {2022},
	keywords = {\_annoté\_FF},
	pages = {172},
}
