POSTagging and Semantic Dictionary Creation for Hittite Cuneiform. Homburg, T. In Lewis, R., Raynor, C., Forest, D., Sinatra, M., & Sinclair, S., editors, Digital Humanities 2017, DH 2017, Conference Abstracts, McGill University & Université de Montréal, Montréal, Canada, August 8-11, 2017. Alliance of Digital Humanities Organizations (ADHO).
Presentation Topic and State of the Art

On our poster we want to present ongoing work to create an automatic natural language processing tool for Hittite cuneiform. To this day, Hittite cuneiform texts are transcribed manually by experts and then published in a transliteration format (commonly ATF). Pictures of the original cuneiform tablet may be provided; more rarely, cuneiform representations in Unicode are present. Thanks to recent advances in the field (such as Cuneify), many Hittite cuneiform transliterations can be converted automatically to their respective cuneiform representation.

Research Contributions

We build upon this work by creating tools that aim to automatically translate Hittite cuneiform texts to English from either a Unicode cuneiform representation or their transliteration.

POSTagger

We have created a morphological analyzer to detect nouns, verbs, several kinds of pronouns, their respective declensions and appendices, as well as structural particles. We have evaluated the morphological analyzer on a sample set of annotated Hittite texts from different epochs, in cuneiform and transliteration representation. We intend to present the results, including the analyzer's advantages, problems and possible solutions, as well as some POSTagging examples in section one of our poster.

Dictionary Creation

Dictionaries for Hittite cuneiform often exist in non-machine-readable formats and without a connection to Semantic Web concepts. We intend to change this situation by parsing digitally available non-semantic dictionaries and using matching algorithms to link the English translations in such dictionaries to concepts in the Semantic Web, e.g. in DBpedia or Wikidata. Dictionaries of this kind are stored using the Lexicon Model for Ontologies (lemon). In addition to freely available dictionaries, we intend to use expert resources developed by the Academy of Sciences in Mainz, Germany, to verify and extend our generated dictionaries. We intend to present the dictionary creation process, statistics about the content of generated dictionaries and their impact in section two of our poster.

Machine Translation

Using the newly created dictionaries as well as the POSTagging information, we intend to test several automated machine translation approaches; we will outline the process and possible approaches in section three of our poster.

Contributions for the Communities

With our approaches we intend to contribute to the archaeological community in Germany by analysing Hittite cuneiform tablets. Together with work from the University of Heidelberg on image recognition of cuneiform tablets, we want to focus on creating a natural language processing pipeline from scanning cuneiform tablets to an available translation in English.
@inproceedings{homburg2017postagging,
title = {POSTagging and Semantic Dictionary Creation for Hittite Cuneiform},
author = {Homburg, Timo},
year = 2017,
month = aug,
day = 9,
booktitle = {Digital Humanities 2017, {DH} 2017, Conference Abstracts, McGill University {\&} Universit{\'{e}} de Montr{\'{e}}al, Montr{\'{e}}al, Canada, August 8--11, 2017},
publisher = {Alliance of Digital Humanities Organizations {(ADHO)}},
address = {Montr{\'{e}}al, Canada},
url = {https://dh2017.adho.org/abstracts/139/139.pdf},
abstract = {Presentation Topic and State of the Art: On our poster we want to present ongoing work to create an automatic natural language processing tool for Hittite cuneiform. To this day, Hittite cuneiform texts are transcribed manually by experts and then published in a transliteration format (commonly ATF). Pictures of the original cuneiform tablet may be provided; more rarely, cuneiform representations in Unicode are present. Thanks to recent advances in the field (such as Cuneify), many Hittite cuneiform transliterations can be converted automatically to their respective cuneiform representation. Research Contributions: We build upon this work by creating tools that aim to automatically translate Hittite cuneiform texts to English from either a Unicode cuneiform representation or their transliteration. POSTagger: We have created a morphological analyzer to detect nouns, verbs, several kinds of pronouns, their respective declensions and appendices, as well as structural particles. We have evaluated the morphological analyzer on a sample set of annotated Hittite texts from different epochs, in cuneiform and transliteration representation. We intend to present the results, including the analyzer's advantages, problems and possible solutions, as well as some POSTagging examples in section one of our poster. Dictionary Creation: Dictionaries for Hittite cuneiform often exist in non-machine-readable formats and without a connection to Semantic Web concepts. We intend to change this situation by parsing digitally available non-semantic dictionaries and using matching algorithms to link the English translations in such dictionaries to concepts in the Semantic Web, e.g. in DBpedia or Wikidata. Dictionaries of this kind are stored using the Lexicon Model for Ontologies (lemon). In addition to freely available dictionaries, we intend to use expert resources developed by the Academy of Sciences in Mainz, Germany, to verify and extend our generated dictionaries. We intend to present the dictionary creation process, statistics about the content of generated dictionaries and their impact in section two of our poster. Machine Translation: Using the newly created dictionaries as well as the POSTagging information, we intend to test several automated machine translation approaches; we will outline the process and possible approaches in section three of our poster. Contributions for the Communities: With our approaches we intend to contribute to the archaeological community in Germany by analysing Hittite cuneiform tablets. Together with work from the University of Heidelberg on image recognition of cuneiform tablets, we want to focus on creating a natural language processing pipeline from scanning cuneiform tablets to an available translation in English.},
language = {english},
editor = {Rhian Lewis and Cecily Raynor and Dominic Forest and Michael Sinatra and St{\'{e}}fan Sinclair},
organization = {Alliance of Digital Humanities Organizations},
keywords = {Hittite, Cuneiform, Dictionary, POSTagging, Semantic Web}
}