Technical Standards for Latin and Ancient Greek: Referencing, Evaluation, Documentation

Technical Standards for Latin and Ancient Greek: Referencing, Evaluation, Documentation. Schulz, K. March, 2026.

This pitch argues that rapid advances in AI and NLP for Latin and Ancient Greek have outpaced the development and adoption of shared technical standards. While the field already has relevant frameworks (such as CITE URNs, Canonical Text Services, model cards, datasheets, and platforms like Hugging Face), their implementation is uneven and insufficient for sustainable research. The core question is how to make NLP research for ancient languages more sustainable. Proposed steps include comprehensive surveys of methods and performance, identifying weaknesses in existing standards or their adoption, developing clear guidelines and best practices, and fostering collaboration through workshops, benchmarking, and shared code. Concrete “first issues” include standardizing lemmatization via canonical lemma lists, creating domain-specific annotation guidelines for named entity recognition, and agreeing on canonical encoding and normalization practices. Overall, the presentation calls for coordinated community efforts to align innovation in digital classics with robust referencing, documentation, and evaluation standards.
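The encoding and normalization issue named among the "first issues" can be illustrated with a minimal sketch. It assumes NFC as the agreed normalization form, which the abstract does not specify, and `normalize_greek` is a hypothetical helper introduced only for this illustration. The example shows two visually identical spellings of the same Greek word that differ at the code-point level until they are normalized.

```python
import unicodedata


def normalize_greek(text: str, form: str = "NFC") -> str:
    """Return text in a single Unicode normalization form.

    NFC is an assumption made for this sketch; a community standard
    could just as well settle on NFD.
    """
    return unicodedata.normalize(form, text)


# The same word, logos, encoded in two ways that render identically.
with_tonos = "\u03bb\u03cc\u03b3\u03bf\u03c2"  # omicron with tonos (U+03CC)
with_oxia = "\u03bb\u1f79\u03b3\u03bf\u03c2"   # omicron with oxia (U+1F79)

# Without normalization the strings compare as different.
raw_equal = with_tonos == with_oxia

# After normalization to a shared form they compare as equal.
normalized_equal = normalize_greek(with_tonos) == normalize_greek(with_oxia)

print(raw_equal, normalized_equal)
```

Without an agreed form, string matching against a lemma list or a tokenizer vocabulary can silently fail on such pairs, which is one reason a shared convention matters for evaluation.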

@misc{schulzTechnicalStandardsLatin2026,
	title = {Technical {Standards} for {Latin} and {Ancient} {Greek}: {Referencing}, {Evaluation}, {Documentation}},
	shorttitle = {Technical {Standards} for {Latin} and {Ancient} {Greek}},
	url = {https://zenodo.org/records/18863970},
	doi = {10.5281/zenodo.18863970},
	abstract = {This pitch argues that rapid advances in AI and NLP for Latin and Ancient Greek have outpaced the development and adoption of shared technical standards.

While the field already has relevant frameworks—such as CITE URNs, Canonical Text Services, model cards, datasheets, and platforms like Hugging Face—their implementation is uneven and insufficient for sustainable research.

The core question is how to make NLP research for ancient languages more sustainable. Proposed steps include:

comprehensive surveys of methods and performance,
identifying weaknesses in existing standards or their adoption,
developing clear guidelines and best practices,
fostering collaboration through workshops, benchmarking, and shared code.

Concrete “first issues” include standardizing lemmatization via canonical lemma lists, creating domain-specific annotation guidelines for named entity recognition, and agreeing on canonical encoding and normalization practices.

Overall, the presentation calls for coordinated community efforts to align innovation in digital classics with robust referencing, documentation, and evaluation standards.},
	language = {eng},
	urldate = {2026-03-04},
	author = {Schulz, Konstantin},
	month = mar,
	year = {2026},
	keywords = {Classics, International standardisation, Natural Language Processing, Standardisation},
}
