Instance Segmentation of Indoor Scenes using a Coverage Loss. Silberman, N.; Sontag, D.; and Fergus, R. In Fleet, D. J.; Pajdla, T.; Schiele, B.; and Tuytelaars, T., editors, Proceedings of the 13th European Conference on Computer Vision (ECCV), volume 8689 of Lecture Notes in Computer Science, pages 616–631, 2014. Springer.
A major limitation of existing models for semantic segmentation is the inability to identify individual instances of the same class: when labeling pixels with only semantic classes, a set of pixels with the same label could represent a single object or ten. In this work, we introduce a model to perform both semantic and instance segmentation simultaneously. We introduce a new higher-order loss function that directly minimizes the coverage metric and evaluate a variety of region features, including those from a convolutional network. We apply our model to the NYU Depth V2 dataset, obtaining state of the art results.
@inproceedings{SilSonFer_ECCV14,
author = {Nathan Silberman and David Sontag and Rob Fergus},
title = {Instance Segmentation of Indoor Scenes using a Coverage Loss},
booktitle = {Proceedings of the 13th European Conference on Computer Vision (ECCV)},
series = {Lecture Notes in Computer Science},
volume = {8689},
publisher = {Springer},
editor = {David J. Fleet and
Tom{\'{a}}s Pajdla and
Bernt Schiele and
Tinne Tuytelaars},
pages = {616--631},
year = {2014},
keywords = {Computer vision, Machine learning},
url_Paper = {http://people.csail.mit.edu/dsontag/papers/SilSonFer_ECCV14.pdf},
abstract = {A major limitation of existing models for semantic segmentation is the inability to identify individual instances of the same class: when labeling pixels with only semantic classes, a set of pixels with the same label could represent a single object or ten. In this work, we introduce a model to perform both semantic and instance segmentation simultaneously. We introduce a new higher-order loss function that directly minimizes the coverage metric and evaluate a variety of region features, including those from a convolutional network. We apply our model to the NYU Depth V2 dataset, obtaining state of the art results.}
}