SciFact-Open: Towards open-domain scientific claim verification. Wadden, D., Lo, K., Kuehl, B., Cohan, A., Beltagy, I., Wang, L. L., & Hajishirzi, H. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4719–4734, Abu Dhabi, United Arab Emirates, December, 2022. Association for Computational Linguistics. Paper doi abstract bibtex While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature. Moving to this open-domain evaluation setting, however, poses unique challenges; in particular, it is infeasible to exhaustively annotate all evidence documents. In this work, we present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems on a corpus of 500K research abstracts. Drawing upon pooling techniques from information retrieval, we collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models. We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1. In addition, analysis of the evidence in SciFact-Open reveals interesting phenomena likely to appear when claim verification systems are deployed in practice, e.g., cases where the evidence supports only a special case of the claim. Our dataset is available at https://github.com/dwadden/scifact-open.
@inproceedings{wadden_scifact-open_2022,
  address   = {Abu Dhabi, United Arab Emirates},
  title     = {{SciFact}-{Open}: Towards Open-Domain Scientific Claim Verification},
  shorttitle = {{SciFact}-{Open}},
  url       = {https://aclanthology.org/2022.findings-emnlp.347},
  doi       = {10.18653/v1/2022.findings-emnlp.347},
  abstract  = {While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature. Moving to this open-domain evaluation setting, however, poses unique challenges; in particular, it is infeasible to exhaustively annotate all evidence documents. In this work, we present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems on a corpus of 500K research abstracts. Drawing upon pooling techniques from information retrieval, we collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models. We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1. In addition, analysis of the evidence in SciFact-Open reveals interesting phenomena likely to appear when claim verification systems are deployed in practice, e.g., cases where the evidence supports only a special case of the claim. Our dataset is available at https://github.com/dwadden/scifact-open.},
  urldate   = {2023-10-13},
  booktitle = {Findings of the Association for Computational Linguistics: {EMNLP} 2022},
  publisher = {Association for Computational Linguistics},
  author    = {Wadden, David and Lo, Kyle and Kuehl, Bailey and Cohan, Arman and Beltagy, Iz and Wang, Lucy Lu and Hajishirzi, Hannaneh},
  month     = dec,
  year      = {2022},
  pages     = {4719--4734},
}
Downloads: 0
{"_id":"ToL54sk4XrJn8gC8z","bibbaseid":"wadden-lo-kuehl-cohan-beltagy-wang-hajishirzi-scifactopentowardsopendomainscientificclaimverification-2022","author_short":["Wadden, D.","Lo, K.","Kuehl, B.","Cohan, A.","Beltagy, I.","Wang, L. L.","Hajishirzi, H."],"bibdata":{"bibtype":"inproceedings","type":"inproceedings","address":"Abu Dhabi, United Arab Emirates","title":"SciFact-Open: Towards open-domain scientific claim verification","shorttitle":"SciFact-Open","url":"https://aclanthology.org/2022.findings-emnlp.347","doi":"10.18653/v1/2022.findings-emnlp.347","abstract":"While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature. Moving to this open-domain evaluation setting, however, poses unique challenges; in particular, it is infeasible to exhaustively annotate all evidence documents. In this work, we present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems on a corpus of 500K research abstracts. Drawing upon pooling techniques from information retrieval, we collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models. We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1. In addition, analysis of the evidence in SciFact-Open reveals interesting phenomena likely to appear when claim verification systems are deployed in practice, e.g., cases where the evidence supports only a special case of the claim. 
Our dataset is available at https://github.com/dwadden/scifact-open.","urldate":"2023-10-13","booktitle":"Findings of the Association for Computational Linguistics: EMNLP 2022","publisher":"Association for Computational Linguistics","author":[{"propositions":[],"lastnames":["Wadden"],"firstnames":["David"],"suffixes":[]},{"propositions":[],"lastnames":["Lo"],"firstnames":["Kyle"],"suffixes":[]},{"propositions":[],"lastnames":["Kuehl"],"firstnames":["Bailey"],"suffixes":[]},{"propositions":[],"lastnames":["Cohan"],"firstnames":["Arman"],"suffixes":[]},{"propositions":[],"lastnames":["Beltagy"],"firstnames":["Iz"],"suffixes":[]},{"propositions":[],"lastnames":["Wang"],"firstnames":["Lucy","Lu"],"suffixes":[]},{"propositions":[],"lastnames":["Hajishirzi"],"firstnames":["Hannaneh"],"suffixes":[]}],"month":"December","year":"2022","pages":"4719–4734","bibtex":"@inproceedings{wadden_scifact-open_2022,\n\taddress = {Abu Dhabi, United Arab Emirates},\n\ttitle = {{SciFact}-{Open}: {Towards} open-domain scientific claim verification},\n\tshorttitle = {{SciFact}-{Open}},\n\turl = {https://aclanthology.org/2022.findings-emnlp.347},\n\tdoi = {10.18653/v1/2022.findings-emnlp.347},\n\tabstract = {While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature. Moving to this open-domain evaluation setting, however, poses unique challenges; in particular, it is infeasible to exhaustively annotate all evidence documents. In this work, we present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems on a corpus of 500K research abstracts. 
Drawing upon pooling techniques from information retrieval, we collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models. We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1. In addition, analysis of the evidence in SciFact-Open reveals interesting phenomena likely to appear when claim verification systems are deployed in practice, e.g., cases where the evidence supports only a special case of the claim. Our dataset is available at https://github.com/dwadden/scifact-open.},\n\turldate = {2023-10-13},\n\tbooktitle = {Findings of the {Association} for {Computational} {Linguistics}: {EMNLP} 2022},\n\tpublisher = {Association for Computational Linguistics},\n\tauthor = {Wadden, David and Lo, Kyle and Kuehl, Bailey and Cohan, Arman and Beltagy, Iz and Wang, Lucy Lu and Hajishirzi, Hannaneh},\n\tmonth = dec,\n\tyear = {2022},\n\tpages = {4719--4734},\n}\n\n","author_short":["Wadden, D.","Lo, K.","Kuehl, B.","Cohan, A.","Beltagy, I.","Wang, L. L.","Hajishirzi, H."],"key":"wadden_scifact-open_2022","id":"wadden_scifact-open_2022","bibbaseid":"wadden-lo-kuehl-cohan-beltagy-wang-hajishirzi-scifactopentowardsopendomainscientificclaimverification-2022","role":"author","urls":{"Paper":"https://aclanthology.org/2022.findings-emnlp.347"},"metadata":{"authorlinks":{}},"html":""},"bibtype":"inproceedings","biburl":"https://bibbase.org/zotero/ifromm","dataSources":["N4kJAiLiJ7kxfNsoh"],"keywords":[],"search_terms":["scifact","open","towards","open","domain","scientific","claim","verification","wadden","lo","kuehl","cohan","beltagy","wang","hajishirzi"],"title":"SciFact-Open: Towards open-domain scientific claim verification","year":2022}