SEFLAG: Systematic Evaluation Framework for NLP Models and Datasets in Latin and Ancient Greek. Schulz, K. & Deichsler, F. November, 2024.
@misc{schulz_seflag_2024,
	title = {{SEFLAG}: {Systematic} {Evaluation} {Framework} for {NLP} {Models} and {Datasets} in {Latin} and {Ancient} {Greek}},
	shorttitle = {{SEFLAG}},
	url = {https://zenodo.org/records/15790925},
	abstract = {The poster presents SEFLAG, a Systematic Evaluation Framework for NLP models and datasets in Latin and Ancient Greek, developed at Humboldt-Universität zu Berlin. It addresses three core research questions: helping literary scholars select suitable NLP models, systematically documenting language resources, and unifying similar but distinct annotation schemas.

SEFLAG integrates components such as model evaluation, data curation, and documentation, using tools like spaCy, flair, Hugging Face, and Zenodo. It supports tasks including named entity recognition, lemmatization, and dependency parsing. A key feature is mapping between different annotation schemas to ensure comparability across resources. Performance for both Latin and Ancient Greek is reported across several models and datasets using metrics such as F1 and accuracy.

Challenges in this domain include linguistic variation, limited resources, interoperability issues, and the need for sustainable, interdisciplinary research. SEFLAG contributes solutions such as publishing model cards and datasheets, using Linked Data for evaluation results, and offering case-specific mappings.

Future plans include expanding to more tasks, models, and datasets, creating educational materials on NLP evaluation, and fully integrating the framework into the Daidalos research infrastructure. All resources are open-access, with code and evaluation data available online.},
	language = {eng},
	urldate = {2025-07-02},
	author = {Schulz, Konstantin and Deichsler, Florian},
	month = nov,
	year = {2024},
	doi = {10.5281/zenodo.15790925},
	keywords = {Artificial intelligence, Languages and literature, Literature, Literature studies, Literature study, Natural language processing},
}