Incorporating quality metrics in centralized/distributed information retrieval on the World Wide Web. Zhu, X. & Gauch, S. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '00, pages 288–295, New York, NY, USA, July 2000. Association for Computing Machinery.
Most information retrieval systems on the Internet rely primarily on similarity ranking algorithms based solely on term frequency statistics. Information quality is usually ignored. This leads to the problem that documents are retrieved without regard to their quality. We present an approach that combines similarity-based similarity ranking with quality ranking in centralized and distributed search environments. Six quality metrics, including the currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness, were investigated. Search effectiveness was significantly improved when the currency, availability, information-to-noise ratio and page cohesiveness metrics were incorporated in centralized search. The improvement seen when the availability, information-to-noise ratio, popularity, and cohesiveness metrics were incorporated in site selection was also significant. Finally, incorporating the popularity metric in information fusion resulted in a significant improvement. In summary, the results show that incorporating quality metrics can generally improve search effectiveness in both centralized and distributed search environments.
@inproceedings{zhu_incorporating_2000,
address = {New York, NY, USA},
series = {{SIGIR} '00},
title = {Incorporating quality metrics in centralized/distributed information retrieval on the {World} {Wide} {Web}},
isbn = {978-1-58113-226-7},
url = {https://dl.acm.org/doi/10.1145/345508.345602},
doi = {10.1145/345508.345602},
abstract = {Most information retrieval systems on the Internet rely primarily on similarity ranking algorithms based solely on term frequency statistics. Information quality is usually ignored. This leads to the problem that documents are retrieved without regard to their quality. We present an approach that combines similarity-based similarity ranking with quality ranking in centralized and distributed search environments. Six quality metrics, including the currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness, were investigated. Search effectiveness was significantly improved when the currency, availability, information-to-noise ratio and page cohesiveness metrics were incorporated in centralized search. The improvement seen when the availability, information-to-noise ratio, popularity, and cohesiveness metrics were incorporated in site selection was also significant. Finally, incorporating the popularity metric in information fusion resulted in a significant improvement. In summary, the results show that incorporating quality metrics can generally improve search effectiveness in both centralized and distributed search environments.},
urldate = {2023-10-13},
booktitle = {Proceedings of the 23rd annual international {ACM} {SIGIR} conference on {Research} and development in information retrieval},
publisher = {Association for Computing Machinery},
author = {Zhu, Xiaolan and Gauch, Susan},
month = jul,
year = {2000},
keywords = {4M\_Data Quality Metrics, 4M\_Research Data Management},
pages = {288--295},
}