TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents. Wolf, T., Sanh, V., Chaumond, J., & Delangue, C.
We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo which is a combination of a Transfer learning based training scheme and a high-capacity Transformer model. Fine-tuning is performed by using a multi-task objective which combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models like memory augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state-of-the-art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45 % absolute improvement), 80.7 (46 % absolute improvement) and 19.5 (20 % absolute improvement).
@article{wolfTransferTransfoTransferLearning2019,
  archivePrefix = {arXiv},
  eprinttype = {arxiv},
  eprint = {1901.08149},
  primaryClass = {cs.CL},
  title = {{{TransferTransfo}}: {{A Transfer Learning Approach}} for {{Neural Network Based Conversational Agents}}},
  url = {http://arxiv.org/abs/1901.08149},
  shorttitle = {{{TransferTransfo}}},
  abstract = {We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo which is a combination of a Transfer learning based training scheme and a high-capacity Transformer model. Fine-tuning is performed by using a multi-task objective which combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models like memory augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state-of-the-art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45 \% absolute improvement), 80.7 (46 \% absolute improvement) and 19.5 (20 \% absolute improvement).},
  urldate = {2019-03-28},
  date = {2019-01-23},
  keywords = {Computer Science - Computation and Language},
  author = {Wolf, Thomas and Sanh, Victor and Chaumond, Julien and Delangue, Clement},
  file = {/home/dimitri/Nextcloud/Zotero/storage/KCTAMC3V/Wolf et al. - 2019 - TransferTransfo A Transfer Learning Approach for .pdf;/home/dimitri/Nextcloud/Zotero/storage/KKVJVLC2/1901.html}
}
