Learning Finer-class Networks for Universal Representation. Girard, J., Tamaazousti, Y., Le Borgne, H., & Hudelot, C. In British Machine Vision Conference (BMVC), Newcastle Upon Tyne (UK), 2018.
Pdf
Supp abstract bibtex Many real-world visual recognition use-cases can not directly benefit from state-of-the-art CNN-based approaches because of the lack of many annotated data. The usual approach to deal with this is to transfer a representation pre-learned on a large annotated source-task onto a target-task of interest. This raises the question of how well the original representation is “universal”, that is to say directly adapted to many different target-tasks. To improve such universality, the state-of-the-art consists in training networks on a diversified source problem, that is modified either by adding generic or specific categories to the initial set of categories. In this vein, we proposed a method that exploits finer-classes than the most specific ones existing, for which no annotation is available. We rely on unsupervised learning and a bottom-up split and merge strategy. We show that our method learns more universal representations than state-of-the-art, leading to significantly better results on 10 target-tasks from multiple domains, using several network architectures, either alone or combined with networks learned at a coarser semantic level.
@inproceedings{girard18bmvc,
author = {Girard, Julien and Tamaazousti, Youssef and Le Borgne, Herv{\'e} and C{\'e}line Hudelot},
title = {Learning Finer-class Networks for Universal Representation},
booktitle = {British Machine Vision Conference (BMVC)},
year = {2018},
address = {Newcastle Upon Tyne (UK)},
url_PDF = {http://bmvc2018.org/contents/papers/1021.pdf},
url_supp = {http://bmvc2018.org/contents/supplementary/pdf/1021_supp.pdf},
abstract = {Many real-world visual recognition use-cases can not directly benefit from state-of-the-art CNN-based approaches because of the lack of many annotated data. The usual approach to deal with this is to transfer a representation pre-learned on a large annotated source-task onto a target-task of interest. This raises the question of how well the original representation is “universal”, that is to say directly adapted to many different target-tasks. To improve such universality, the state-of-the-art consists in training networks on a diversified source problem, that is modified either by adding generic or specific categories to the initial set of categories. In this vein, we proposed a method that exploits finer-classes than the most specific ones existing, for which no annotation is available. We rely on unsupervised learning and a bottom-up split and merge strategy. We show that our method learns more universal representations than state-of-the-art, leading to significantly better results on 10 target-tasks from multiple domains, using several network architectures, either alone or combined with networks learned at a coarser semantic level.},
keywords = {frugal-learning}
}