D2-net: A trainable CNN for joint description and detection of local features

D2-net: A trainable CNN for joint description and detection of local features. Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., & Sattler, T. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June:8084-8093, 2019.

In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction.
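The "dual role" described above can be illustrated with a small sketch: one dense feature map supplies both the keypoints and their descriptors. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation. The function name `detect_and_describe`, the strict 3x3 local-maximum rule, and the L2 normalisation are choices made for this example, and the feature map is a plain array standing in for the output of a CNN.

```python
import numpy as np


def detect_and_describe(features):
    """Joint detection and description from one dense feature map.

    features: array of shape (C, H, W), a stand-in for CNN activations.
    Returns (keypoints, descriptors) with shapes (N, 2) and (N, C).
    Assumed detection rule for this sketch: a pixel is a keypoint if,
    in its strongest channel, it strictly exceeds its 8 spatial neighbours.
    """
    features = np.asarray(features, dtype=float)
    C, H, W = features.shape

    # Channel with the strongest response at every pixel.
    best_channel = features.argmax(axis=0)  # (H, W)

    # Per-channel maximum over the 8 neighbours (centre excluded).
    padded = np.pad(features, ((0, 0), (1, 1), (1, 1)),
                    constant_values=-np.inf)
    neighbour_max = np.full_like(features, -np.inf)
    for dy in range(3):
        for dx in range(3):
            if dy == 1 and dx == 1:
                continue  # skip the centre pixel itself
            shifted = padded[:, dy:dy + H, dx:dx + W]
            neighbour_max = np.maximum(neighbour_max, shifted)
    is_spatial_max = features > neighbour_max  # (C, H, W)

    # Keep pixels that are a spatial maximum in their strongest channel.
    rows, cols = np.indices((H, W))
    is_keypoint = is_spatial_max[best_channel, rows, cols]
    keypoints = np.argwhere(is_keypoint)  # (N, 2) as (row, col)

    # The descriptor is the channel vector at each keypoint, L2-normalised.
    descriptors = features[:, keypoints[:, 0], keypoints[:, 1]].T
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    descriptors = descriptors / np.maximum(norms, 1e-12)
    return keypoints, descriptors
```

Because detection runs on the feature map rather than on raw pixel intensities, the keypoints depend on higher-level structure, which is the sense in which detection is postponed to a later stage.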

@article{
 title = {D2-net: A trainable CNN for joint description and detection of local features},
 type = {article},
 year = {2019},
 keywords = {3D from Multiview and Sensors,Categorization,Deep Learning,Low-level Vision,Recognition: Detection,Retrieval},
 pages = {8084-8093},
 volume = {2019-June},
 bibtype = {article},
 author = {Dusmanu, Mihai and Rocco, Ignacio and Pajdla, Tomas and Pollefeys, Marc and Sivic, Josef and Torii, Akihiko and Sattler, Torsten},
 doi = {10.1109/CVPR.2019.00828},
 journal = {Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition}
}
