Towards Domain Independence in CNN-based Acoustic Localization using Deep Cross Correlations

Towards Domain Independence in CNN-based Acoustic Localization using Deep Cross Correlations. Vera-Diaz, J. M., Pizarro, D., & Macias-Guarasa, J. In 2020 28th European Signal Processing Conference (EUSIPCO), pages 226-230, Aug, 2020.


Time delay estimation is essential in Acoustic Source Localization (ASL) systems. One of the most widely used techniques for this purpose is the Generalized Cross Correlation (GCC) between a pair of signals, together with its use in Steered Response Power (SRP) techniques, which estimate the acoustic power at a specific location. Nowadays, Deep Learning strategies may outperform these methods. However, they generally depend on the geometric and sensor configuration conditions available during the training phase, and thus have limited generalization capabilities when facing new environments if neither re-training nor adaptation is applied. In this work, we propose a method based on an encoder-decoder CNN architecture capable of outperforming the well-known SRP-PHAT algorithm, as well as other Deep Learning strategies, when working in mismatched training-testing conditions without requiring model re-training. Our proposal aims to estimate a smoothed version of the correlation signals, which is then used to generate a refined acoustic power map, leading to better performance on the ASL task. Our experimental evaluation uses three publicly available realistic datasets and provides a comparison with the SRP-PHAT algorithm and other recent proposals based on Deep Learning.
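
For readers unfamiliar with the classical baseline named in the abstract, the following is a minimal sketch of GCC with PHAT weighting for a single microphone pair. It is not the encoder-decoder CNN proposed in the paper, only the textbook time delay estimator that SRP-PHAT builds on. The function name `gcc_phat`, its parameters, and the small regularization constant are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def gcc_phat(x, y, fs):
    """Estimate the delay of y relative to x (in seconds) with GCC-PHAT.

    Hypothetical helper for illustration; x and y are 1-D arrays sampled at fs Hz.
    Returns (delay_seconds, correlation), where correlation is ordered from
    lag -max_shift to lag +max_shift.
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum; the PHAT weighting keeps only the phase.
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Reorder so that negative lags come first, then lag 0, then positive lags.
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))

    lag = int(np.argmax(cc)) - max_shift  # positive lag: y arrives after x
    return lag / fs, cc
```

In an SRP-PHAT style system, correlations like `cc` from every microphone pair would be sampled at the lags implied by each candidate location and summed to form the acoustic power map; the abstract describes replacing the raw correlations with a smoothed estimate before that step.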

@InProceedings{9287466,
  author = {J. M. Vera-Diaz and D. Pizarro and J. Macias-Guarasa},
  booktitle = {2020 28th European Signal Processing Conference (EUSIPCO)},
  title = {Towards Domain Independence in CNN-based Acoustic Localization using Deep Cross Correlations},
  year = {2020},
  pages = {226--230},
  abstract = {Time delay estimation is essential in Acoustic Source Localization (ASL) systems. One of the most widely used techniques for this purpose is the Generalized Cross Correlation (GCC) between a pair of signals, together with its use in Steered Response Power (SRP) techniques, which estimate the acoustic power at a specific location. Nowadays, Deep Learning strategies may outperform these methods. However, they generally depend on the geometric and sensor configuration conditions available during the training phase, and thus have limited generalization capabilities when facing new environments if neither re-training nor adaptation is applied. In this work, we propose a method based on an encoder-decoder CNN architecture capable of outperforming the well-known SRP-PHAT algorithm, as well as other Deep Learning strategies, when working in mismatched training-testing conditions without requiring model re-training. Our proposal aims to estimate a smoothed version of the correlation signals, which is then used to generate a refined acoustic power map, leading to better performance on the ASL task. Our experimental evaluation uses three publicly available realistic datasets and provides a comparison with the SRP-PHAT algorithm and other recent proposals based on Deep Learning.},
  keywords = {Deep learning;Training;Correlation;Acoustics;Proposals;Task analysis;Microphones;Acoustic Source Localization;Generalized Cross Correlation;Steered Response Power;Convolutional Neural Networks;Deep Learning},
  doi = {10.23919/Eusipco47968.2020.9287466},
  issn = {2076-1465},
  month = {Aug},
  url = {https://www.eurasip.org/proceedings/eusipco/eusipco2020/pdfs/0000226.pdf},
}

