Web-Scale Information Extraction in KnowItAll (Preliminary Results). Etzioni, O., Cafarella, M., Downey, D., Kok, S., Popescu, A., Shaked, T., Soderland, S., Weld, D. S., & Yates, A. In Proceedings of the 13th International Conference on World Wide Web, WWW '04, pages 100-110, 2004. ACM Press.
Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KNOWITALL, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KNOWITALL, running for four days on a single machine, was able to automatically extract 54,753 facts. KNOWITALL associates a probability with each fact enabling it to trade off precision and recall. The paper analyzes KNOWITALL's architecture and reports on lessons learned for the design of large-scale information extraction systems.
@inProceedings{Etzioni2004,
title = {Web-Scale Information Extraction in KnowItAll (Preliminary Results)},
type = {inProceedings},
year = {2004},
keywords = {information extraction,mutual information,search},
pages = {100-110},
websites = {http://portal.acm.org/citation.cfm?id=988687},
publisher = {ACM Press},
series = {WWW '04},
id = {c0175776-d50f-345b-9d9a-dab8f04ee7fd},
created = {2011-02-24T21:47:51.000Z},
file_attached = {true},
profile_id = {5284e6aa-156c-3ce5-bc0e-b80cf09f3ef6},
group_id = {066b42c8-f712-3fc3-abb2-225c158d2704},
last_modified = {2017-03-14T14:36:19.698Z},
read = {false},
starred = {false},
authored = {false},
confirmed = {true},
hidden = {false},
citation_key = {Etzioni2004},
private_publication = {false},
abstract = {Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KNOWITALL, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KNOWITALL, running for four days on a single machine, was able to automatically extract 54,753 facts. KNOWITALL associates a probability with each fact enabling it to trade off precision and recall. The paper analyzes KNOWITALL's architecture and reports on lessons learned for the design of large-scale information extraction systems.},
bibtype = {inProceedings},
author = {Etzioni, Oren and Cafarella, Michael and Downey, Doug and Kok, Stanley and Popescu, Ana-Maria and Shaked, Tal and Soderland, Stephen and Weld, Daniel S and Yates, Alexander},
booktitle = {Proceedings of the 13th international conference on World Wide Web}
}