Data Distillation: Towards Omni-Supervised Learning. Radosavovic, I., Dollár, P., Girshick, R., Gkioxari, G., & He, K. December, 2017. arXiv:1712.04440 [cs]
Abstract: We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data. Omni-supervised learning is lower-bounded by performance on existing labeled datasets, offering the potential to surpass state-of-the-art fully supervised methods. To exploit the omni-supervised setting, we propose data distillation, a method that ensembles predictions from multiple transformations of unlabeled data, using a single model, to automatically generate new training annotations. We argue that visual recognition models have recently become accurate enough that it is now possible to apply classic ideas about self-training to challenging real-world data. Our experimental results show that in the cases of human keypoint detection and general object detection, state-of-the-art models trained with data distillation surpass the performance of using labeled data from the COCO dataset alone.
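The core recipe in the abstract (a single model ensembled over multiple geometric transforms of an unlabeled image, with the aggregated prediction reused as a training annotation) can be summarized in a few lines. The sketch below is a rough illustration of that step for the keypoint case, assuming a `model(image)` callable that returns per-keypoint `(x, y, score)` predictions; it is not the authors' Detectron / Mask R-CNN implementation, and the scale set, score threshold, and function names are illustrative assumptions.

```python
# Minimal sketch of data distillation for one unlabeled image: run one model
# on several geometric transforms, map predictions back to the original frame,
# and average them into a pseudo-annotation. NOT the paper's Detectron code;
# the model interface, scales, and threshold are assumptions for illustration.
import numpy as np

def data_distill_keypoints(model, image, scales=(0.5, 1.0, 2.0), use_flip=True,
                           score_thresh=0.5):
    """Ensemble one model's keypoint predictions over transforms of one image.

    `model(img)` is assumed to return a (K, 3) array of (x, y, score) for a
    single person instance; the real system uses a Mask R-CNN keypoint head.
    """
    h, w = image.shape[:2]
    all_preds = []
    for s in scales:
        # Nearest-neighbor resize via index arrays keeps the sketch dependency-free.
        new_h, new_w = int(round(h * s)), int(round(w * s))
        ys = np.clip((np.arange(new_h) / s).astype(int), 0, h - 1)
        xs = np.clip((np.arange(new_w) / s).astype(int), 0, w - 1)
        scaled = image[ys][:, xs]

        preds = np.array(model(scaled), dtype=float)
        preds[:, :2] /= s                              # map back to original scale
        all_preds.append(preds)

        if use_flip:
            fpreds = np.array(model(scaled[:, ::-1]), dtype=float)
            fpreds[:, 0] = (new_w - 1) - fpreds[:, 0]  # undo the horizontal flip
            fpreds[:, :2] /= s
            # A real keypoint model would also swap left/right keypoint indices here.
            all_preds.append(fpreds)

    # Average the aligned predictions and keep only confident keypoints as the
    # automatically generated training annotation (pseudo-label).
    ensemble = np.mean(np.stack(all_preds), axis=0)
    return ensemble[ensemble[:, 2] > score_thresh]

if __name__ == "__main__":
    # Toy stand-in for a trained model: 17 COCO keypoints at the image center.
    def dummy_model(img):
        ih, iw = img.shape[:2]
        return np.tile([iw / 2.0, ih / 2.0, 0.9], (17, 1))

    pseudo_label = data_distill_keypoints(dummy_model, np.zeros((480, 640, 3)))
    print(pseudo_label.shape)   # (17, 3)
```

In the paper, the pseudo-labels produced this way on internet-scale unlabeled images are mixed with the original COCO annotations to retrain the same model.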
@misc{radosavovic_data_2017,
title = {Data {Distillation}: {Towards} {Omni}-{Supervised} {Learning}},
shorttitle = {Data {Distillation}},
url = {http://arxiv.org/abs/1712.04440},
doi = {10.48550/arXiv.1712.04440},
abstract = {We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data. Omni-supervised learning is lower-bounded by performance on existing labeled datasets, offering the potential to surpass state-of-the-art fully supervised methods. To exploit the omni-supervised setting, we propose data distillation, a method that ensembles predictions from multiple transformations of unlabeled data, using a single model, to automatically generate new training annotations. We argue that visual recognition models have recently become accurate enough that it is now possible to apply classic ideas about self-training to challenging real-world data. Our experimental results show that in the cases of human keypoint detection and general object detection, state-of-the-art models trained with data distillation surpass the performance of using labeled data from the COCO dataset alone.},
language = {en},
urldate = {2023-08-09},
publisher = {arXiv},
author = {Radosavovic, Ilija and Dollár, Piotr and Girshick, Ross and Gkioxari, Georgia and He, Kaiming},
month = dec,
year = {2017},
note = {arXiv:1712.04440 [cs]},
keywords = {\#CVPR{\textgreater}18, \#Deep Learning, \#Distilling, /unread, Computer Science - Computer Vision and Pattern Recognition},
}
{"_id":"yf8aCnj5E9HMcSspC","bibbaseid":"radosavovic-dollr-girshick-gkioxari-he-datadistillationtowardsomnisupervisedlearning-2017","author_short":["Radosavovic, I.","Dollár, P.","Girshick, R.","Gkioxari, G.","He, K."],"bibdata":{"bibtype":"misc","type":"misc","title":"Data Distillation: Towards Omni-Supervised Learning","shorttitle":"Data Distillation","url":"http://arxiv.org/abs/1712.04440","doi":"10.48550/arXiv.1712.04440","abstract":"We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data. Omni-supervised learning is lower-bounded by performance on existing labeled datasets, offering the potential to surpass state-of-the-art fully supervised methods. To exploit the omni-supervised setting, we propose data distillation, a method that ensembles predictions from multiple transformations of unlabeled data, using a single model, to automatically generate new training annotations. We argue that visual recognition models have recently become accurate enough that it is now possible to apply classic ideas about self-training to challenging real-world data. Our experimental results show that in the cases of human keypoint detection and general object detection, state-of-the-art models trained with data distillation surpass the performance of using labeled data from the COCO dataset alone.","language":"en","urldate":"2023-08-09","publisher":"arXiv","author":[{"propositions":[],"lastnames":["Radosavovic"],"firstnames":["Ilija"],"suffixes":[]},{"propositions":[],"lastnames":["Dollár"],"firstnames":["Piotr"],"suffixes":[]},{"propositions":[],"lastnames":["Girshick"],"firstnames":["Ross"],"suffixes":[]},{"propositions":[],"lastnames":["Gkioxari"],"firstnames":["Georgia"],"suffixes":[]},{"propositions":[],"lastnames":["He"],"firstnames":["Kaiming"],"suffixes":[]}],"month":"December","year":"2017","note":"arXiv:1712.04440 [cs]","keywords":"#CVPR\\textgreater18, #Deep Learning, #Distilling, /unread, Computer Science - Computer Vision and Pattern Recognition","bibtex":"@misc{radosavovic_data_2017,\n\ttitle = {Data {Distillation}: {Towards} {Omni}-{Supervised} {Learning}},\n\tshorttitle = {Data {Distillation}},\n\turl = {http://arxiv.org/abs/1712.04440},\n\tdoi = {10.48550/arXiv.1712.04440},\n\tabstract = {We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data. Omni-supervised learning is lower-bounded by performance on existing labeled datasets, offering the potential to surpass state-of-the-art fully supervised methods. To exploit the omni-supervised setting, we propose data distillation, a method that ensembles predictions from multiple transformations of unlabeled data, using a single model, to automatically generate new training annotations. We argue that visual recognition models have recently become accurate enough that it is now possible to apply classic ideas about self-training to challenging real-world data. 
Our experimental results show that in the cases of human keypoint detection and general object detection, state-of-the-art models trained with data distillation surpass the performance of using labeled data from the COCO dataset alone.},\n\tlanguage = {en},\n\turldate = {2023-08-09},\n\tpublisher = {arXiv},\n\tauthor = {Radosavovic, Ilija and Dollár, Piotr and Girshick, Ross and Gkioxari, Georgia and He, Kaiming},\n\tmonth = dec,\n\tyear = {2017},\n\tnote = {arXiv:1712.04440 [cs]},\n\tkeywords = {\\#CVPR{\\textgreater}18, \\#Deep Learning, \\#Distilling, /unread, Computer Science - Computer Vision and Pattern Recognition},\n}\n\n\n\n","author_short":["Radosavovic, I.","Dollár, P.","Girshick, R.","Gkioxari, G.","He, K."],"key":"radosavovic_data_2017","id":"radosavovic_data_2017","bibbaseid":"radosavovic-dollr-girshick-gkioxari-he-datadistillationtowardsomnisupervisedlearning-2017","role":"author","urls":{"Paper":"http://arxiv.org/abs/1712.04440"},"keyword":["#CVPR\\textgreater18","#Deep Learning","#Distilling","/unread","Computer Science - Computer Vision and Pattern Recognition"],"metadata":{"authorlinks":{}},"downloads":0,"html":""},"bibtype":"misc","biburl":"https://bibbase.org/zotero/zzhenry2012","dataSources":["nZHrFJKyxKKDaWYM8"],"keywords":["#cvpr\\textgreater18","#deep learning","#distilling","/unread","computer science - computer vision and pattern recognition"],"search_terms":["data","distillation","towards","omni","supervised","learning","radosavovic","dollár","girshick","gkioxari","he"],"title":"Data Distillation: Towards Omni-Supervised Learning","year":2017}