Cross-linguistic annotation transfer in geoparsing experiments with Classical texts. Soffiantini, L. DH Benelux Journal, 6:155–168, 2024.
The Natural History is an encyclopedic work written by the Latin author Pliny the Elder (first century CE). In this extensive text of 37 books, geography plays a pivotal role, with hundreds of mentions of ancient place names. In this paper, a geoparsing experiment is conducted on the Natural History with the aim of automatically identifying and extracting place entities. To achieve this, we take advantage of state-of-the-art NLP models to develop a multistage pipeline involving English Named Entity Recognition, English-Latin sentence alignment, and entity projection. The paper demonstrates how cross-lingual annotation transfer can be applied from a translation in a modern language back to the original text in the context of low-/medium-resource languages, such as Latin. The efficacy of the proposed pipeline is evaluated using both standard metrics and a comprehensive manual error analysis. Additionally, the results are compared to those obtained by other Latin NER tools. Both analyses demonstrate that the proposed methodology achieves a superior F1-score. Finally, by projecting pre-disambiguated annotations derived from another geospatial project, the majority of place entities were automatically associated with unique identifiers that enable geolocation.
