Learning and Evaluating General Linguistic Intelligence. Yogatama, D., de Masson d'Autume, C., Connor, J., Kocisky, T., Chrzanowski, M., Kong, L., Lazaridou, A., Ling, W., Yu, L., Dyer, C., & Blunsom, P.
Paper: http://arxiv.org/abs/1901.11373
We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly. Using this definition, we analyze state-of-the-art natural language understanding models and conduct an extensive empirical investigation to evaluate them against these criteria through a series of experiments that assess the task-independence of the knowledge being acquired by the learning process. In addition to task performance, we propose a new evaluation metric based on an online encoding of the test data that quantifies how quickly an existing agent (model) learns a new task. Our results show that while the field has made impressive progress in terms of model architectures that generalize to many tasks, these models still require a lot of in-domain training examples (e.g., for fine tuning, training task-specific modules), and are prone to catastrophic forgetting. Moreover, we find that far from solving general tasks (e.g., document question answering), our models are overfitting to the quirks of particular datasets (e.g., SQuAD). We discuss missing components and conjecture on how to make progress toward general linguistic intelligence.
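The evaluation metric mentioned in the abstract is an online (prequential) codelength: the model is scored on each new example before it is allowed to learn from it, so a model that adapts quickly pays few bits on later examples. The sketch below illustrates that idea only; the toy FrequencyModel and its interface are hypothetical stand-ins, not the authors' implementation, which applies the same accounting to full NLU models and test-set blocks.

import math
from collections import Counter

class FrequencyModel:
    # Toy label-frequency model with add-one smoothing, standing in for a real
    # NLU model; only the interface (log_prob, update) matters for the metric.
    def __init__(self, labels):
        self.counts = Counter({label: 1 for label in labels})  # add-one prior
    def log_prob(self, y):
        total = sum(self.counts.values())
        return math.log(self.counts[y] / total)
    def update(self, y):
        self.counts[y] += 1

def online_codelength(model, stream):
    # Prequential codelength in bits: accumulate the surprisal of each example
    # before the model is updated on it.
    bits = 0.0
    for y in stream:
        bits += -model.log_prob(y) / math.log(2)  # convert nats to bits
        model.update(y)
    return bits

# A model that adapts quickly to the stream accumulates fewer total bits.
stream = ["pos", "pos", "neg", "pos", "pos", "pos"]
print(round(online_codelength(FrequencyModel({"pos", "neg"}), stream), 2))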
@article{yogatamaLearningEvaluatingGeneral2019,
  archivePrefix = {arXiv},
  eprinttype = {arxiv},
  eprint = {1901.11373},
  primaryClass = {cs, stat},
  title = {Learning and {{Evaluating General Linguistic Intelligence}}},
  url = {http://arxiv.org/abs/1901.11373},
  abstract = {We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly. Using this definition, we analyze state-of-the-art natural language understanding models and conduct an extensive empirical investigation to evaluate them against these criteria through a series of experiments that assess the task-independence of the knowledge being acquired by the learning process. In addition to task performance, we propose a new evaluation metric based on an online encoding of the test data that quantifies how quickly an existing agent (model) learns a new task. Our results show that while the field has made impressive progress in terms of model architectures that generalize to many tasks, these models still require a lot of in-domain training examples (e.g., for fine tuning, training task-specific modules), and are prone to catastrophic forgetting. Moreover, we find that far from solving general tasks (e.g., document question answering), our models are overfitting to the quirks of particular datasets (e.g., SQuAD). We discuss missing components and conjecture on how to make progress toward general linguistic intelligence.},
  urldate = {2019-02-04},
  date = {2019-01-31},
  keywords = {Statistics - Machine Learning,Computer Science - Computation and Language,Computer Science - Machine Learning},
  author = {Yogatama, Dani and de Masson d'Autume, Cyprien and Connor, Jerome and Kocisky, Tomas and Chrzanowski, Mike and Kong, Lingpeng and Lazaridou, Angeliki and Ling, Wang and Yu, Lei and Dyer, Chris and Blunsom, Phil},
}
