Cross-lingual Name Tagging and Linking for 282 Languages. Pan, X., Zhang, B., May, J., Nothman, J., Knight, K., & Ji, H. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1946–1958, Vancouver, Canada, July, 2017. Association for Computational Linguistics.
The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia. Given a document in any of these languages, our framework is able to identify name mentions, assign a coarse-grained or fine-grained type to each mention, and link it to an English Knowledge Base (KB) if it is linkable. We achieve this goal by performing a series of new KB mining methods: generating "silver-standard" annotations by transferring annotations from English to other languages through cross-lingual links and KB properties, refining annotations through self-training and topic selection, deriving language-specific morphology features from anchor links, and mining word translation pairs from cross-lingual links. Both name tagging and linking results for 282 languages are promising on Wikipedia data and non-Wikipedia data.
@InProceedings{pan-EtAl:2017:Long2,
  author    = {Pan, Xiaoman  and  Zhang, Boliang  and  May, Jonathan  and  Nothman, Joel  and  Knight, Kevin  and  Ji, Heng},
  title     = {Cross-lingual Name Tagging and Linking for 282 Languages},
  booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month     = {July},
  year      = {2017},
  address   = {Vancouver, Canada},
  publisher = {Association for Computational Linguistics},
  pages     = {1946--1958},
  abstract  = {The ambitious goal of this work is to develop a cross-lingual name tagging and
	linking framework for 282 languages that exist in Wikipedia. Given a document
	in any of these languages, our framework is able to identify name mentions,
	assign a coarse-grained or fine-grained type to each mention, and link it to an
	English Knowledge Base (KB) if it is linkable. We achieve this goal by
	performing a series of new KB mining methods: generating ``silver-standard''
	annotations by transferring annotations from English to other languages through
	cross-lingual links and KB properties, refining annotations through
	self-training and topic selection, deriving language-specific morphology
	features from anchor links, and mining word translation pairs from
	cross-lingual links. Both name tagging and linking results for 282 languages
	are promising on Wikipedia data and non-Wikipedia data.},
  url       = {http://aclweb.org/anthology/P17-1178}
}
