Named Entity Annotation Projection Applied to Classical Languages. Yousef, T., Palladino, C., Heyer, G., & Jänicke, S. Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, 2023.
In this study, we demonstrate how to apply cross-lingual annotation projection to transfer named-entity annotations to classical languages for which limited or no resources and annotated texts are available, aiming to enrich their NER training datasets and train a model to perform NER tagging. Our approach employs sentence-level aligned corpora of ancient texts and the translation in a modern language, for which high-quality off-the-shelf NER systems are available. We automatically annotate the text of the modern language and employ a state-of-the-art neural word alignment system to find translation equivalents. Finally, we transfer the annotations to the corresponding tokens in the ancient texts using a direct projection heuristic. We applied our method to ancient Greek and Latin using the Bible with the English translation as a parallel corpus. We used the resulting annotations to enhance the performance of an existing NER model for ancient Greek.
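The projection step described in the abstract can be illustrated with a minimal sketch. It assumes BIO-tagged tokens on the modern-language side and word alignments given as (source index, target index) pairs. The function name `project_tags`, the tie-breaking rule, and the toy sentence are hypothetical and are not taken from the paper, whose exact heuristic may differ.

```python
from typing import List, Optional, Tuple


def project_tags(
    src_tags: List[str],
    alignments: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Project BIO tags from source tokens onto aligned target tokens.

    Assumption (not from the paper): when several source tokens align to
    the same target token, the first entity type encountered wins.
    """
    # Entity type per target token; None means "outside any entity".
    tgt_types: List[Optional[str]] = [None] * tgt_len
    for src_idx, tgt_idx in sorted(alignments):
        tag = src_tags[src_idx]
        if tag != "O" and tgt_types[tgt_idx] is None:
            tgt_types[tgt_idx] = tag.split("-", 1)[1]

    # Re-derive BIO prefixes on the target side, since word order and
    # token counts can differ between the two languages.
    # Simplification: adjacent target tokens of the same type are merged
    # into a single entity span.
    projected: List[str] = []
    prev: Optional[str] = None
    for ent_type in tgt_types:
        if ent_type is None:
            projected.append("O")
        elif ent_type == prev:
            projected.append("I-" + ent_type)
        else:
            projected.append("B-" + ent_type)
        prev = ent_type
    return projected


# Toy example (invented for illustration): English source, Latin target.
english = ["Jesus", "went", "to", "Jerusalem"]
english_tags = ["B-PER", "O", "O", "B-LOC"]
latin = ["Iesus", "venit", "Hierosolymam"]
# "to" and "Jerusalem" both align to "Hierosolymam".
pairs = [(0, 0), (1, 1), (2, 2), (3, 2)]

latin_tags = project_tags(english_tags, pairs, len(latin))
print(list(zip(latin, latin_tags)))
```

In a full pipeline the source tags would come from an off-the-shelf NER system and the pairs from a neural word aligner. Both are hard-coded here so the sketch stays self-contained.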
@article{yousef_named_2023,
title = {Named {Entity} {Annotation} {Projection} {Applied} to {Classical} {Languages}},
url = {https://aclanthology.org/2023.latechclfl-1.19.pdf},
abstract = {In this study, we demonstrate how to apply cross-lingual annotation projection to transfer named-entity annotations to classical languages for which limited or no resources and annotated texts are available, aiming to enrich their NER training datasets and train a model to perform NER tagging. Our approach employs sentence-level aligned corpora of ancient texts and the translation in a modern language, for which high-quality off-the-shelf NER systems are available. We automatically annotate the text of the modern language and employ a state-of-the-art neural word alignment system to find translation equivalents. Finally, we transfer the annotations to the corresponding tokens in the ancient texts using a direct projection heuristic. We applied our method to ancient Greek and Latin using the Bible with the English translation as a parallel corpus. We used the resulting annotations to enhance the performance of an existing NER model for ancient Greek.},
language = {en},
journal = {Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature},
author = {Yousef, Tariq and Palladino, Chiara and Heyer, Gerhard and Jänicke, Stefan},
year = {2023},
pages = {175--182},
}