Named Entity Recognition for Hungarian Using Various Machine Learning Algorithms. Farkas, R., Szarvas, G., & Kocsor, A. Acta Cybernetica, 00(0000):1-15, Citeseer, 2006.
In this paper we introduce a statistical Named Entity Recognition (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns (Artificial Neural Network, Support Vector Machine, C4.5 Decision Tree), their combinations, and the effects of dimensionality reduction. For training and validation we used a segment of the Szeged Corpus [7], which consists of short business news articles collected from MTI (Hungarian News Agency, www.mti.hu). Our results were presented at the Second Conference on Hungarian Computational Linguistics [9]. Our system makes use of both language-dependent features (describing the orthography of proper nouns in Hungarian) and other, language-independent information such as capitalization. Since we avoided the inclusion of large gazetteers of pre-classified entities, the system remains portable across languages without requiring any major modification, as long as the few specialized orthographical and syntactic characteristics are collected for a new target language. The best-performing model achieved an F-measure of 91.95%.
@article{Farkas2006,
title = {Named Entity Recognition for Hungarian Using Various Machine Learning Algorithms},
type = {article},
year = {2006},
keywords = {machine learning,named entity recognition,statistical models},
pages = {1-15},
volume = {00},
websites = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.107.4317&rep=rep1&type=pdf},
publisher = {Citeseer},
id = {f2ae5ef7-5d52-35f3-9687-1e9a0dc55aab},
created = {2011-12-29T19:53:53.000Z},
file_attached = {false},
profile_id = {5284e6aa-156c-3ce5-bc0e-b80cf09f3ef6},
group_id = {066b42c8-f712-3fc3-abb2-225c158d2704},
last_modified = {2017-03-14T14:36:19.698Z},
tags = {named entity recognition},
read = {false},
starred = {false},
authored = {false},
confirmed = {true},
hidden = {false},
citation_key = {Farkas2006},
private_publication = {false},
abstract = {In this paper we introduce a statistical Named Entity Recognition (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns (Artificial Neural Network, Support Vector Machine, C4.5 Decision Tree), their combinations, and the effects of dimensionality reduction. For training and validation we used a segment of the Szeged Corpus [7], which consists of short business news articles collected from MTI (Hungarian News Agency, www.mti.hu). Our results were presented at the Second Conference on Hungarian Computational Linguistics [9]. Our system makes use of both language-dependent features (describing the orthography of proper nouns in Hungarian) and other, language-independent information such as capitalization. Since we avoided the inclusion of large gazetteers of pre-classified entities, the system remains portable across languages without requiring any major modification, as long as the few specialized orthographical and syntactic characteristics are collected for a new target language. The best-performing model achieved an F-measure of 91.95%.},
bibtype = {article},
author = {Farkas, R and Szarvas, G and Kocsor, A},
journal = {Acta Cybernetica},
number = {0000}
}