TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents. Wolf, T., Sanh, V., Chaumond, J., & Delangue, C.
We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo which is a combination of a Transfer learning based training scheme and a high-capacity Transformer model. Fine-tuning is performed by using a multi-task objective which combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models like memory augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state-of-the-art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45 % absolute improvement), 80.7 (46 % absolute improvement) and 19.5 (20 % absolute improvement).
@article{wolfTransferTransfoTransferLearning2019,
archivePrefix = {arXiv},
eprinttype = {arxiv},
eprint = {1901.08149},
primaryClass = {cs},
title = {{{TransferTransfo}}: {{A Transfer Learning Approach}} for {{Neural Network Based Conversational Agents}}},
url = {http://arxiv.org/abs/1901.08149},
shorttitle = {{{TransferTransfo}}},
abstract = {We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo which is a combination of a Transfer learning based training scheme and a high-capacity Transformer model. Fine-tuning is performed by using a multi-task objective which combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models like memory augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state-of-the-art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45 \% absolute improvement), 80.7 (46 \% absolute improvement) and 19.5 (20 \% absolute improvement).},
urldate = {2019-03-28},
date = {2019-01-23},
keywords = {Computer Science - Computation and Language},
author = {Wolf, Thomas and Sanh, Victor and Chaumond, Julien and Delangue, Clement},
file = {/home/dimitri/Nextcloud/Zotero/storage/KCTAMC3V/Wolf et al. - 2019 - TransferTransfo A Transfer Learning Approach for .pdf;/home/dimitri/Nextcloud/Zotero/storage/KKVJVLC2/1901.html}
}