CITREC: An Evaluation Framework for Citation-Based Similarity Measures based on TREC Genomics and PubMed Central. Gipp, B., Meuschke, N., & Lipinski, M. In Proceedings of the iConference, Newport Beach, California, March, 2015.
Citation-based similarity measures such as Bibliographic Coupling and Co-Citation are an integral component of many information retrieval systems. However, comparing the strengths and weaknesses of these measures is challenging due to the lack of suitable test collections. This paper presents CITREC, an open evaluation framework for citation-based and text-based similarity measures. CITREC prepares the data from the PubMed Central Open Access Subset and the TREC Genomics collection for citation-based analysis and provides the tools necessary for evaluating similarity measures. To account for different evaluation purposes, CITREC implements 35 citation-based and text-based similarity measures and features two gold standards. The first gold standard uses the Medical Subject Headings (MeSH) thesaurus to gauge similarity; the second uses the expert relevance feedback that is part of the TREC Genomics collection. CITREC additionally offers a system for creating user-defined gold standards, so that the evaluation framework can be adapted to individual information needs and evaluation purposes.
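For readers unfamiliar with the two measures named in the abstract, the following is a minimal sketch of how Bibliographic Coupling and Co-Citation counts are commonly computed from reference lists. It is not taken from the CITREC code base; the function names, the input format (a dict mapping each citing document to its cited documents), and the toy data are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


def bibliographic_coupling(references):
    """Count shared references for every pair of citing documents.

    `references` maps a citing document ID to the IDs it cites
    (hypothetical input format, chosen for illustration).
    """
    strength = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:
            strength[(a, b)] = shared
    return strength


def co_citation(references):
    """Count how often each pair of documents is cited together."""
    counts = defaultdict(int)
    for cited in references.values():
        # Every pair within one reference list is co-cited once.
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Toy citation data: three citing documents and their reference lists.
toy = {
    "A": ["X", "Y", "Z"],
    "B": ["Y", "Z"],
    "C": ["Z"],
}

coupling = bibliographic_coupling(toy)
cocitation = co_citation(toy)
print(coupling)
print(cocitation)
```

In this toy data, documents A and B share two references, so their coupling strength is 2, while Y and Z appear together in two reference lists, so their co-citation count is 2. Real evaluations typically normalize such raw counts, which this sketch does not attempt.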
