Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches. Žbontar, J. & LeCun, Y. arXiv:1510.05970 [cs], October, 2015. arXiv: 1510.05970
We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.
@article{zbontar_stereo_2015,
	title = {Stereo {Matching} by {Training} a {Convolutional} {Neural} {Network} to {Compare} {Image} {Patches}},
	url = {http://arxiv.org/abs/1510.05970},
	abstract = {We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.},
	urldate = {2018-01-12},
	journal = {arXiv:1510.05970 [cs]},
	author = {Žbontar, Jure and LeCun, Yann},
	month = oct,
	year = {2015},
	note = {arXiv: 1510.05970},
	keywords = {Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning, Computer Science - Neural and Evolutionary Computing}
}
