Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer. Feng, X., Feng, X., Qin, B., Feng, Z., & Liu, T. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pages 4071–4077, Stockholm, Sweden, July, 2018. International Joint Conferences on Artificial Intelligence Organization.
Neural networks have been widely used for high resource language (e.g. English) named entity recognition (NER) and have shown state-of-the-art results. However, for low resource languages, such as Dutch and Spanish, due to the limitation of resources and lack of annotated data, NER models tend to have lower performances. To narrow this gap, we investigate cross-lingual knowledge to enrich the semantic representations of low resource languages. We first develop neural networks to improve low resource word representations via knowledge transfer from a high resource language using bilingual lexicons. Further, a lexicon extension strategy is designed to address the out-of-lexicon problem by automatically learning semantic projections. Finally, we regard word-level entity type distribution features as external language-independent knowledge and incorporate them into our neural architecture. Experiments on two low resource languages (Dutch and Spanish) demonstrate the effectiveness of these additional semantic representations (average 4.8% improvement). Moreover, on the Chinese OntoNotes 4.0 dataset, our approach achieves an F-score of 83.07% with a 2.91% absolute gain compared to the state-of-the-art systems.
@inproceedings{feng_improving_2018,
	address = {Stockholm, Sweden},
	title = {Improving {Low} {Resource} {Named} {Entity} {Recognition} using {Cross}-lingual {Knowledge} {Transfer}},
	isbn = {978-0-9992411-2-7},
	url = {https://www.ijcai.org/proceedings/2018/566},
	doi = {10.24963/ijcai.2018/566},
	abstract = {Neural networks have been widely used for high resource language (e.g. English) named entity recognition (NER) and have shown state-of-the-art results. However, for low resource languages, such as Dutch and Spanish, due to the limitation of resources and lack of annotated data, NER models tend to have lower performances. To narrow this gap, we investigate cross-lingual knowledge to enrich the semantic representations of low resource languages. We first develop neural networks to improve low resource word representations via knowledge transfer from a high resource language using bilingual lexicons. Further, a lexicon extension strategy is designed to address the out-of-lexicon problem by automatically learning semantic projections. Finally, we regard word-level entity type distribution features as external language-independent knowledge and incorporate them into our neural architecture. Experiments on two low resource languages (Dutch and Spanish) demonstrate the effectiveness of these additional semantic representations (average 4.8\% improvement). Moreover, on the Chinese OntoNotes 4.0 dataset, our approach achieves an F-score of 83.07\% with a 2.91\% absolute gain compared to the state-of-the-art systems.},
	language = {en},
	urldate = {2025-01-01},
	booktitle = {Proceedings of the {Twenty}-{Seventh} {International} {Joint} {Conference} on {Artificial} {Intelligence}},
	publisher = {International Joint Conferences on Artificial Intelligence Organization},
	author = {Feng, Xiaocheng and Feng, Xiachong and Qin, Bing and Feng, Zhangyin and Liu, Ting},
	month = jul,
	year = {2018},
	pages = {4071--4077},
}