Contextual String Embeddings for Sequence Labeling. Akbik, A., Blythe, D., & Vollgraf, R. In COLING 2018, 27th International Conference on Computational Linguistics, pages 1638–1649, 2018.

Abstract: Recent advances in language modeling using recurrent neural networks have made it viable to model language as distributions over characters. By learning to predict the next character on the basis of previous characters, such models have been shown to automatically internalize linguistic concepts such as words, sentences, subclauses and even sentiment. In this paper, we propose to leverage the internal states of a trained character language model to produce a novel type of word embedding which we refer to as contextual string embeddings. Our proposed embeddings have the distinct properties that they (a) are trained without any explicit notion of words and thus fundamentally model words as sequences of characters, and (b) are contextualized by their surrounding text, meaning that the same word will have different embeddings depending on its contextual use. We conduct a comparative evaluation against previous embeddings and find that our embeddings are highly useful for downstream tasks: across four classic sequence labeling tasks we consistently outperform the previous state-of-the-art. In particular, we significantly outperform previous work on English and German named entity recognition (NER), allowing us to report new state-of-the-art F1-scores on the CoNLL03 shared task. We release all code and pre-trained language models in a simple-to-use framework to the research community, to enable reproduction of these experiments and application of our proposed embeddings to other tasks: https://github.com/zalandoresearch/flair
@inproceedings{akbik_contextual_2018,
title = {Contextual {String} {Embeddings} for {Sequence} {Labeling}},
	abstract = {Recent advances in language modeling using recurrent neural networks have made it viable to
model language as distributions over characters. By learning to predict the next character on the
basis of previous characters, such models have been shown to automatically internalize linguistic
concepts such as words, sentences, subclauses and even sentiment. In this paper, we propose
to leverage the internal states of a trained character language model to produce a novel type of
word embedding which we refer to as contextual string embeddings. Our proposed embeddings
have the distinct properties that they (a) are trained without any explicit notion of words and
thus fundamentally model words as sequences of characters, and (b) are contextualized by their
surrounding text, meaning that the same word will have different embeddings depending on its
contextual use. We conduct a comparative evaluation against previous embeddings and find that
our embeddings are highly useful for downstream tasks: across four classic sequence labeling
tasks we consistently outperform the previous state-of-the-art. In particular, we significantly
outperform previous work on English and German named entity recognition (NER), allowing us
to report new state-of-the-art F1-scores on the CoNLL03 shared task. We release all code and
pre-trained language models in a simple-to-use framework to the research community, to enable
reproduction of these experiments and application of our proposed embeddings to other tasks:
https://github.com/zalandoresearch/flair},
booktitle = {{COLING} 2018, 27th {International} {Conference} on {Computational} {Linguistics}},
author = {Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
year = {2018},
pages = {1638--1649},
}
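The flair framework linked above exposes these embeddings directly. Below is a minimal usage sketch, assuming flair is installed (pip install flair); the 'news-forward' and 'news-backward' names refer to flair's standard pre-trained English character language models, and the example sentence is an illustrative choice, not taken from the paper:

# Minimal sketch: contextual string embeddings via the flair framework.
# Assumes a current flair installation; 'news-forward'/'news-backward'
# are flair's standard pre-trained English character language models.
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, StackedEmbeddings

# Stack the forward and backward character language models, so each
# word receives both directions' hidden states.
embeddings = StackedEmbeddings([
    FlairEmbeddings('news-forward'),
    FlairEmbeddings('news-backward'),
])

# Because the embeddings are contextualized by surrounding text, the
# two occurrences of 'Washington' below receive different vectors.
sentence = Sentence('Washington went to Washington')
embeddings.embed(sentence)

for token in sentence:
    print(token.text, token.embedding.shape)

Stacking the two directions mirrors the paper's construction, which concatenates the forward language model's hidden state after a word with the backward language model's hidden state before it.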
{"_id":"5FodADgWLC3gRGQsE","bibbaseid":"akbik-blythe-vollgraf-contextualstringembeddingsforsequencelabeling-2018","author_short":["Akbik, A.","Blythe, D.","Vollgraf, R."],"bibdata":{"bibtype":"inproceedings","type":"inproceedings","title":"Contextual String Embeddings for Sequence Labeling","abstract":"Recent advances in language modeling using recurrent neural networks have made it viable to model language as distributions over characters. By learning to predict the next character on the basis of previous characters, such models have been shown to automatically internalize linguistic concepts such as words, sentences, subclauses and even sentiment. In this paper, we propose to leverage the internal states of a trained character language model to produce a novel type of word embedding which we refer to as contextual string embeddings. Our proposed embeddings have the distinct properties that they (a) are trained without any explicit notion of words and thus fundamentally model words as sequences of characters, and (b) are contextualized by their surrounding text, meaning that the same word will have different embeddings depending on its contextual use. We conduct a comparative evaluation against previous embeddings and find that our embeddings are highly useful for downstream tasks: across four classic sequence labeling tasks we consistently outperform the previous state-of-the-art. In particular, we significantly outperform previous work on English and German named entity recognition (NER), allowing us to report new state-of-the-art F1-scores on the C O NLL03 shared task. We release all code and pre-trained language models in a simple-to-use framework to the re- search community, to enable reproduction of these experiments and application of our proposed embeddings to other tasks: https://github.com/zalandoresearch/flair","booktitle":"COLING 2018, 27th International Conference on Computational Linguistics","author":[{"propositions":[],"lastnames":["Akbik"],"firstnames":["Alan"],"suffixes":[]},{"propositions":[],"lastnames":["Blythe"],"firstnames":["Duncan"],"suffixes":[]},{"propositions":[],"lastnames":["Vollgraf"],"firstnames":["Roland"],"suffixes":[]}],"year":"2018","pages":"1638–1649","bibtex":"@inproceedings{akbik_contextual_2018,\n\ttitle = {Contextual {String} {Embeddings} for {Sequence} {Labeling}},\n\tabstract = {Recent advances in language modeling using recurrent neural networks have made it viable to\nmodel language as distributions over characters. By learning to predict the next character on the\nbasis of previous characters, such models have been shown to automatically internalize linguistic\nconcepts such as words, sentences, subclauses and even sentiment. In this paper, we propose\nto leverage the internal states of a trained character language model to produce a novel type of\nword embedding which we refer to as contextual string embeddings. Our proposed embeddings\nhave the distinct properties that they (a) are trained without any explicit notion of words and\nthus fundamentally model words as sequences of characters, and (b) are contextualized by their\nsurrounding text, meaning that the same word will have different embeddings depending on its\ncontextual use. We conduct a comparative evaluation against previous embeddings and find that\nour embeddings are highly useful for downstream tasks: across four classic sequence labeling\ntasks we consistently outperform the previous state-of-the-art. 
In particular, we significantly\noutperform previous work on English and German named entity recognition (NER), allowing us\nto report new state-of-the-art F1-scores on the C O NLL03 shared task.\nWe release all code and pre-trained language models in a simple-to-use framework to the re-\nsearch community, to enable reproduction of these experiments and application of our proposed\nembeddings to other tasks: https://github.com/zalandoresearch/flair},\n\tbooktitle = {{COLING} 2018, 27th {International} {Conference} on {Computational} {Linguistics}},\n\tauthor = {Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},\n\tyear = {2018},\n\tpages = {1638--1649},\n}\n\n\n\n\n\n\n\n","author_short":["Akbik, A.","Blythe, D.","Vollgraf, R."],"key":"akbik_contextual_2018","id":"akbik_contextual_2018","bibbaseid":"akbik-blythe-vollgraf-contextualstringembeddingsforsequencelabeling-2018","role":"author","urls":{},"metadata":{"authorlinks":{}},"downloads":0,"html":""},"bibtype":"inproceedings","biburl":"https://bibbase.org/zotero/thomaskrause","dataSources":["AtpmbBy7pywMXxoua"],"keywords":[],"search_terms":["contextual","string","embeddings","sequence","labeling","akbik","blythe","vollgraf"],"title":"Contextual String Embeddings for Sequence Labeling","year":2018}