Unsupervised Domain Adaptation of Contextualized Embeddings: A Case Study in Early Modern English. Han, X. & Eisenstein, J.
@article{hanUnsupervisedDomainAdaptation2019,
  archivePrefix = {arXiv},
  eprinttype = {arxiv},
  eprint = {1904.02817},
  primaryClass = {cs},
  title = {Unsupervised {{Domain Adaptation}} of {{Contextualized Embeddings}}: {{A Case Study}} in {{Early Modern English}}},
  url = {http://arxiv.org/abs/1904.02817},
  shorttitle = {Unsupervised {{Domain Adaptation}} of {{Contextualized Embeddings}}},
  abstract = {Contextualized word embeddings such as ELMo and BERT provide a foundation for strong performance across a range of natural language processing tasks, in part by pretraining on a large and topically-diverse corpus. However, the applicability of this approach is unknown when the target domain varies substantially from the text used during pretraining. Specifically, we are interested in the scenario in which labeled data is available in only a canonical source domain such as news text, and the target domain is distinct from both the labeled corpus and the pretraining data. To address this scenario, we propose domain-adaptive fine-tuning, in which the contextualized embeddings are adapted by masked language modeling on the target domain. We test this approach on the challenging domain of Early Modern English, which differs substantially from existing pretraining corpora. Domain-adaptive fine-tuning yields an improvement of 4\% in part-of-speech tagging accuracy over a BERT baseline, substantially improving on prior work on this task.},
  urldate = {2019-04-08},
  date = {2019-04-04},
  keywords = {Computer Science - Digital Libraries,Computer Science - Computation and Language,Computer Science - Machine Learning},
  author = {Han, Xiaochuang and Eisenstein, Jacob},
}
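The domain-adaptive fine-tuning the abstract describes reuses the masked language modeling objective on unlabeled target-domain text. A minimal sketch of the corruption step is below, assuming BERT's standard masking recipe (15% of positions selected; of those, 80% replaced by [MASK], 10% by a random token, 10% left unchanged) — the paper applies this objective to Early Modern English text before supervised POS training; the toy vocabulary and function name here are illustrative, not from the paper.

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "quick", "brown", "fox", "jumps"]  # toy vocabulary for illustration

def mlm_mask(tokens, mask_prob=0.15, rng=None):
    """BERT-style masked language modeling corruption.

    Each token is selected with probability `mask_prob`. A selected token is
    replaced by [MASK] 80% of the time, by a random vocabulary token 10% of
    the time, and left unchanged 10% of the time. `labels` holds the original
    token at selected positions and None elsewhere (standing in for the
    ignored label index, commonly -100, so only corrupted positions
    contribute to the loss).
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # model must predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK)
            elif r < 0.9:
                inputs.append(rng.choice(VOCAB))
            else:
                inputs.append(tok)
        else:
            labels.append(None)  # position ignored by the loss
            inputs.append(tok)
    return inputs, labels
```

In practice one would run this corruption over batches of target-domain sentences and continue training the pretrained encoder on the masked-token prediction loss, then fine-tune the adapted model on the labeled source-domain tagging data.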