Web-Scale Information Extraction in KnowItAll (Preliminary Results)

Web-Scale Information Extraction in KnowItAll (Preliminary Results). Etzioni, O., Cafarella, M., Downey, D., Kok, S., Popescu, A., Shaked, T., Soderland, S., Weld, D. S., & Yates, A. In Proceedings of the 13th International Conference on World Wide Web, WWW '04, pages 100-110, 2004. ACM Press.

Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KNOWITALL, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KNOWITALL, running for four days on a single machine, was able to automatically extract 54,753 facts. KNOWITALL associates a probability with each fact enabling it to trade off precision and recall. The paper analyzes KNOWITALL's architecture and reports on lessons learned for the design of large-scale information extraction systems.
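The abstract's remark that a per-fact probability lets the system trade off precision and recall can be illustrated with a small sketch. This is not KNOWITALL's code: the facts, probabilities, correctness labels, and the function name `precision_recall` are all hypothetical, chosen only to show how raising a probability threshold tends to increase precision while lowering recall.

```python
from typing import List, Tuple

# Hypothetical (fact, probability, is_correct) triples; NOT data from the paper.
Fact = Tuple[str, float, bool]

FACTS: List[Fact] = [
    ("City(Seattle)", 0.95, True),
    ("City(Paris)", 0.90, True),
    ("City(Tuesday)", 0.60, False),
    ("City(Lyon)", 0.55, True),
    ("City(Blue)", 0.20, False),
]


def precision_recall(facts: List[Fact], threshold: float) -> Tuple[float, float]:
    """Keep facts whose probability is >= threshold, then score the kept set."""
    kept = [f for f in facts if f[1] >= threshold]
    total_correct = sum(1 for f in facts if f[2])
    kept_correct = sum(1 for f in kept if f[2])
    # Convention assumed here: an empty kept set counts as precision 1.0.
    precision = kept_correct / len(kept) if kept else 1.0
    recall = kept_correct / total_correct if total_correct else 0.0
    return precision, recall


if __name__ == "__main__":
    for t in (0.8, 0.5, 0.1):
        p, r = precision_recall(FACTS, t)
        print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

With these made-up numbers, a strict threshold keeps only the two high-probability facts (all correct, but one correct fact is missed), while a loose threshold recovers every correct fact at the cost of admitting incorrect ones.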

@inProceedings{Etzioni2004,
 title = {Web-Scale Information Extraction in KnowItAll (Preliminary Results)},
 type = {inProceedings},
 year = {2004},
 keywords = {information extraction,mutual information,search},
 pages = {100-110},
 websites = {http://portal.acm.org/citation.cfm?id=988687},
 publisher = {ACM Press},
 series = {WWW '04},
 id = {c0175776-d50f-345b-9d9a-dab8f04ee7fd},
 created = {2011-02-24T21:47:51.000Z},
 file_attached = {true},
 profile_id = {5284e6aa-156c-3ce5-bc0e-b80cf09f3ef6},
 group_id = {066b42c8-f712-3fc3-abb2-225c158d2704},
 last_modified = {2017-03-14T14:36:19.698Z},
 read = {false},
 starred = {false},
 authored = {false},
 confirmed = {true},
 hidden = {false},
 citation_key = {Etzioni2004},
 private_publication = {false},
 abstract = {Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KNOWITALL, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KNOWITALL, running for four days on a single machine, was able to automatically extract 54,753 facts. KNOWITALL associates a probability with each fact enabling it to trade off precision and recall. The paper analyzes KNOWITALL's architecture and reports on lessons learned for the design of large-scale information extraction systems.},
 bibtype = {inProceedings},
 author = {Etzioni, Oren and Cafarella, Michael and Downey, Doug and Kok, Stanley and Popescu, Ana-Maria and Shaked, Tal and Soderland, Stephen and Weld, Daniel S and Yates, Alexander},
 booktitle = {Proceedings of the 13th international conference on World Wide Web}
}
