Introducing the CLEF 2020 HIPE Shared Task: Named Entity Recognition and Linking on Historical Newspapers

Introducing the CLEF 2020 HIPE Shared Task: Named Entity Recognition and Linking on Historical Newspapers. Ehrmann, M., Romanello, M., Bircher, S., & Clematide, S. In Jose, J. M., Yilmaz, E., Magalhães, J., Castells, P., Ferro, N., Silva, M. J., & Martins, F., editors, Advances in Information Retrieval, Lecture Notes in Computer Science, pages 524–532, Cham, 2020. Springer International Publishing.
Since its introduction some twenty years ago, named entity (NE) processing has become an essential component of virtually any text mining application and has undergone major changes. Recently, two main trends characterise its developments: the adoption of deep learning architectures and the consideration of textual material originating from historical and cultural heritage collections. While the former opens up new opportunities, the latter introduces new challenges with heterogeneous, historical and noisy inputs. If NE processing tools are increasingly being used in the context of historical documents, performance values are below the ones on contemporary data and are hardly comparable. In this context, this paper introduces the CLEF 2020 Evaluation Lab HIPE (Identifying Historical People, Places and other Entities) on named entity recognition and linking on diachronic historical newspaper material in French, German and English. Our objective is threefold: strengthening the robustness of existing approaches on non-standard inputs, enabling performance comparison of NE processing on historical texts, and, in the long run, fostering efficient semantic indexing of historical documents in order to support scholarship on digital cultural heritage collections.

@inproceedings{ehrmann_introducing_2020,
	address = {Cham},
	series = {Lecture {Notes} in {Computer} {Science}},
	title = {Introducing the {CLEF} 2020 {HIPE} {Shared} {Task}: {Named} {Entity} {Recognition} and {Linking} on {Historical} {Newspapers}},
	isbn = {978-3-030-45442-5},
	shorttitle = {Introducing the {CLEF} 2020 {HIPE} {Shared} {Task}},
	doi = {10.1007/978-3-030-45442-5_68},
	abstract = {Since its introduction some twenty years ago, named entity (NE) processing has become an essential component of virtually any text mining application and has undergone major changes. Recently, two main trends characterise its developments: the adoption of deep learning architectures and the consideration of textual material originating from historical and cultural heritage collections. While the former opens up new opportunities, the latter introduces new challenges with heterogeneous, historical and noisy inputs. If NE processing tools are increasingly being used in the context of historical documents, performance values are below the ones on contemporary data and are hardly comparable. In this context, this paper introduces the CLEF 2020 Evaluation Lab HIPE (Identifying Historical People, Places and other Entities) on named entity recognition and linking on diachronic historical newspaper material in French, German and English. Our objective is threefold: strengthening the robustness of existing approaches on non-standard inputs, enabling performance comparison of NE processing on historical texts, and, in the long run, fostering efficient semantic indexing of historical documents in order to support scholarship on digital cultural heritage collections.},
	language = {en},
	booktitle = {Advances in {Information} {Retrieval}},
	publisher = {Springer International Publishing},
	author = {Ehrmann, Maud and Romanello, Matteo and Bircher, Stefan and Clematide, Simon},
	editor = {Jose, Joemon M. and Yilmaz, Emine and Magalhães, João and Castells, Pablo and Ferro, Nicola and Silva, Mário J. and Martins, Flávio},
	year = {2020},
	keywords = {Digital Humanities, Historical newspapers, Information extraction, Named entity processing, Text understanding},
	pages = {524--532},
}
