Visualizing and Measuring the Geometry of BERT. Coenen, A., Reif, E., Yuan, A., Kim, B., Pearce, A., Viégas, F., & Wattenberg, M.
@article{coenenVisualizingMeasuringGeometry2019,
  eprinttype = {arxiv},
  eprint = {1906.02715},
  primaryClass = {cs, stat},
  title = {Visualizing and {{Measuring}} the {{Geometry}} of {{BERT}}},
  url = {http://arxiv.org/abs/1906.02715},
  abstract = {Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces. We find evidence of a fine-grained geometric representation of word senses. We also present empirical descriptions of syntactic representations in both attention matrices and individual word embeddings, as well as a mathematical argument to explain the geometry of these representations.},
  urldate = {2019-06-21},
  date = {2019-06-06},
  keywords = {Statistics - Machine Learning,Computer Science - Computation and Language,Computer Science - Machine Learning},
  author = {Coenen, Andy and Reif, Emily and Yuan, Ann and Kim, Been and Pearce, Adam and Viégas, Fernanda and Wattenberg, Martin},
}
