Adapting transformer models to morphological tagging of two highly inflectional languages: a case study on Ancient Greek and Latin. Keersmaekers, A. & Mercelis, W. In Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024), Hybrid in Bangkok, Thailand and online, 2024. Association for Computational Linguistics.
Natural language processing for Greek and Latin, inflectional languages with small corpora, requires special techniques. For morphological tagging, transformer models show promising potential, but the best approach to use these models is unclear. For both languages, this paper examines the impact of using morphological lexica, training different model types (a single model with a combined feature tag, multiple models for separate features, and a multi-task model for all features), and adding linguistic constraints. We find that, although simply fine-tuning transformers to predict a monolithic tag may already yield decent results, each of these adaptations can further improve tagging accuracy.
@inproceedings{keersmaekers_adapting_2024,
	title = {Adapting transformer models to morphological tagging of two highly inflectional languages: a case study on {Ancient} {Greek} and {Latin}},
	shorttitle = {Adapting transformer models to morphological tagging of two highly inflectional languages},
	url = {https://aclanthology.org/2024.ml4al-1.17},
	doi = {10.18653/v1/2024.ml4al-1.17},
	abstract = {Natural language processing for Greek and Latin, inflectional languages with small corpora, requires special techniques. For morphological tagging, transformer models show promising potential, but the best approach to use these models is unclear. For both languages, this paper examines the impact of using morphological lexica, training different model types (a single model with a combined feature tag, multiple models for separate features, and a multi-task model for all features), and adding linguistic constraints. We find that, although simply fine-tuning transformers to predict a monolithic tag may already yield decent results, each of these adaptations can further improve tagging accuracy.},
	language = {en},
	urldate = {2025-01-26},
	booktitle = {Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)},
	address = {Hybrid in Bangkok, Thailand and online},
	publisher = {Association for Computational Linguistics},
	author = {Keersmaekers, Alek and Mercelis, Wouter},
	year = {2024},
	pages = {165--176},
}
