Knowledge Distillation
Paper: https://www.youtube.com/watch?v=b3zf-JylUus
This is a presentation of the paper from Stanford University: BAM! Born-Again Multi-Task Networks for Natural Language Understanding (https://bit.ly/2LkJkTS), along with earlier references on the knowledge distillation topic. A simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then average their predictions. Unfortunately, making predictions with a whole ensemble of models is cumbersome and may be too computationally expensive to deploy to a large number of users, especially if the individual models are large neural nets. The main idea behind knowledge distillation is to distill these large (teacher) models into smaller, almost as accurate, and more production-friendly (student) models.
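
For concreteness, the teacher-student idea above can be sketched as a training loss: the student is fit to the teacher's temperature-softened output distribution in addition to the hard labels. The PyTorch snippet below is a minimal illustration of this classic formulation (Hinton et al., 2015), not code from the BAM paper; the temperature T, the mixing weight alpha, and the name distillation_loss are illustrative choices.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: KL divergence between the temperature-softened teacher
    # and student distributions; the T*T factor keeps gradient magnitudes
    # comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha trades off imitating the teacher vs. fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard

# Usage sketch: the teacher runs in eval mode with gradients disabled,
# and only the student's parameters are updated.
# with torch.no_grad():
#     teacher_logits = teacher(batch)
# loss = distillation_loss(student(batch), teacher_logits, labels)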
@misc{noauthor_knowledge_nodate,
	title = {Knowledge {Distillation}},
	url = {https://www.youtube.com/watch?v=b3zf-JylUus},
	abstract = {This is a presentation of the paper from Stanford University:

BAM! Born-Again Multi-Task Networks for Natural Language Understanding
Link: https://bit.ly/2LkJkTS

and other earlier references about the Knowledge Distillation topic.

A simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets.

The main idea behind knowledge distillation is to distill these large (teacher) models into smaller, almost as accurate, and more production-friendly (student) models.},
	urldate = {2019-11-18}
}
