Net2Net: Accelerating Learning via Knowledge Transfer. Chen, T., Goodfellow, I., & Shlens, J. April 2016. arXiv:1511.05641 [cs]
We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.
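
The function-preserving idea is simple to demonstrate. Below is a minimal NumPy sketch (not the authors' code; all names and sizes are illustrative) of the paper's two transformations: Net2DeeperNet inserts an identity-initialized layer behind a ReLU, and Net2WiderNet replicates hidden units via a random mapping g and splits their outgoing weights by replication count, so the new network computes exactly the same function as the old one.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # existing layer: 4 -> 8
W2 = rng.standard_normal((8, 3))   # existing layer: 8 -> 3

def f_old(x):
    return relu(x @ W1) @ W2

# Net2DeeperNet: insert an 8 -> 8 layer initialized to the identity.
# Since the incoming activations are already non-negative and ReLU is
# idempotent on non-negative inputs, relu(h @ I) == h, so the function
# is preserved exactly.
W_new = np.eye(8)

def f_deep(x):
    h = relu(x @ W1)
    h = relu(h @ W_new)
    return h @ W2

# Net2WiderNet: widen the hidden layer from 8 to 10 units. g maps each
# new unit to an original unit (every original unit appears at least
# once); incoming weights are copied, outgoing weights are divided by
# how many replicas share each original unit.
g = np.concatenate([np.arange(8), rng.integers(0, 8, size=2)])
counts = np.bincount(g, minlength=8)
U1 = W1[:, g]                       # copy incoming weights
U2 = W2[g, :] / counts[g][:, None]  # split outgoing weights

def f_wide(x):
    return relu(x @ U1) @ U2

x = rng.standard_normal((5, 4))
assert np.allclose(f_old(x), f_deep(x))  # deeper net: same function
assert np.allclose(f_old(x), f_wide(x))  # wider net: same function

After either transformation, training simply continues on the larger network from this initialization rather than from scratch.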
@misc{chen_net2net_2016,
	title = {{Net2Net}: {Accelerating} {Learning} via {Knowledge} {Transfer}},
	shorttitle = {{Net2Net}},
	url = {http://arxiv.org/abs/1511.05641},
	abstract = {We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.},
	language = {en},
	urldate = {2023-06-16},
	publisher = {arXiv},
	author = {Chen, Tianqi and Goodfellow, Ian and Shlens, Jonathon},
	month = apr,
	year = {2016},
	note = {arXiv:1511.05641 [cs]},
	keywords = {Computer Science - Machine Learning},
}
