Net2Net: Accelerating Learning via Knowledge Transfer. Chen, T., Goodfellow, I., & Shlens, J. April 2016. arXiv:1511.05641 [cs]
We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.
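
The function-preserving idea is simple to demonstrate. Below is a minimal NumPy sketch (not the authors' code; all names and sizes are illustrative) of the paper's two transformations: Net2DeeperNet inserts an identity-initialized layer behind a ReLU, and Net2WiderNet replicates hidden units via a random mapping g and splits their outgoing weights by replication count, so the new network computes exactly the same function as the old one.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # existing layer: 4 -> 8
W2 = rng.standard_normal((8, 3))   # existing layer: 8 -> 3

def f_old(x):
    return relu(x @ W1) @ W2

# Net2DeeperNet: insert an 8 -> 8 layer initialized to the identity.
# Since the incoming activations are already non-negative and ReLU is
# idempotent on non-negative inputs, relu(h @ I) == h, so the function
# is preserved exactly.
W_new = np.eye(8)

def f_deep(x):
    h = relu(x @ W1)
    h = relu(h @ W_new)
    return h @ W2

# Net2WiderNet: widen the hidden layer from 8 to 10 units. g maps each
# new unit to an original unit (every original unit appears at least
# once); incoming weights are copied, outgoing weights are divided by
# how many replicas share each original unit.
g = np.concatenate([np.arange(8), rng.integers(0, 8, size=2)])
counts = np.bincount(g, minlength=8)
U1 = W1[:, g]                       # copy incoming weights
U2 = W2[g, :] / counts[g][:, None]  # split outgoing weights

def f_wide(x):
    return relu(x @ U1) @ U2

x = rng.standard_normal((5, 4))
assert np.allclose(f_old(x), f_deep(x))  # deeper net: same function
assert np.allclose(f_old(x), f_wide(x))  # wider net: same function

After either transformation, training simply continues on the larger network from this initialization rather than from scratch.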
@misc{chen_net2net_2016,
	title = {{Net2Net}: {Accelerating} {Learning} via {Knowledge} {Transfer}},
	shorttitle = {{Net2Net}},
	url = {http://arxiv.org/abs/1511.05641},
	abstract = {We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.},
	language = {en},
	urldate = {2023-06-16},
	publisher = {arXiv},
	author = {Chen, Tianqi and Goodfellow, Ian and Shlens, Jonathon},
	month = apr,
	year = {2016},
	note = {arXiv:1511.05641 [cs]},
	keywords = {Computer Science - Machine Learning},
}
