Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches. Žbontar, J. & LeCun, Y. arXiv:1510.05970 [cs], October 2015.

We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.
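The abstract's training setup (a binary classification data set of similar and dissimilar patch pairs) can be illustrated with a minimal sketch. It is not the authors' sampling procedure: the function name `make_patch_pairs`, the patch half-width `half`, and the negative offset range `neg_low`..`neg_high` are hypothetical choices made for illustration. The only fact relied on is the standard convention for a rectified pair, where a left-image pixel at column x with disparity d corresponds to the right-image pixel at column x - d on the same row.

```python
import random


def make_patch_pairs(left, right, disparity, half=1, neg_low=2, neg_high=4, rng=None):
    """Build (left_patch, right_patch, label) triples from one rectified pair.

    left, right : 2D lists of grayscale intensities with identical shape.
    disparity   : 2D list; disparity[y][x] is the known integer disparity of
                  the left pixel (x, y), or None where it is unknown.
    half, neg_low, neg_high are illustrative assumptions, not paper values.
    """
    rng = rng or random.Random(0)
    height, width = len(left), len(left[0])

    def patch(img, cx, cy):
        # Square patch of side 2*half + 1 centred on (cx, cy).
        return [row[cx - half:cx + half + 1] for row in img[cy - half:cy + half + 1]]

    def inside(cx):
        # True when a patch centred on column cx fits in the image.
        return half <= cx < width - half

    examples = []
    for y in range(half, height - half):
        for x in range(half, width - half):
            d = disparity[y][x]
            if d is None:
                continue
            x_pos = x - d  # true match in the right image
            # Shift the centre away from the true match to get a mismatch.
            offset = rng.randint(neg_low, neg_high) * rng.choice((-1, 1))
            x_neg = x_pos + offset
            if not (inside(x_pos) and inside(x_neg)):
                continue
            left_patch = patch(left, x, y)
            examples.append((left_patch, patch(right, x_pos, y), 1))  # similar
            examples.append((left_patch, patch(right, x_neg, y), 0))  # dissimilar
    return examples
```

Each labelled pair would then be fed to a classifier that scores patch similarity; that score, evaluated over candidate disparities, is what the abstract describes as initializing the matching cost.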

@article{zbontar_stereo_2015,
	title = {Stereo {Matching} by {Training} a {Convolutional} {Neural} {Network} to {Compare} {Image} {Patches}},
	url = {http://arxiv.org/abs/1510.05970},
	abstract = {We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.},
	urldate = {2018-01-12TZ},
	journal = {arXiv:1510.05970 [cs]},
	author = {Žbontar, Jure and LeCun, Yann},
	month = oct,
	year = {2015},
	note = {arXiv: 1510.05970},
	keywords = {Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning, Computer Science - Neural and Evolutionary Computing}
}
