Multichannel Audio Front-End for Far-Field Automatic Speech Recognition

Multichannel Audio Front-End for Far-Field Automatic Speech Recognition. Chhetri, A., Hilmes, P., Kristjansson, T., Chu, W., Mansour, M., Li, X., & Zhang, X. In 2018 26th European Signal Processing Conference (EUSIPCO), pages 1527-1531, Sep., 2018.

Far-field automatic speech recognition (ASR) is a key enabling technology that allows untethered and natural voice interaction between users and the Amazon Echo family of products. A key component in realizing far-field ASR on these products is the suite of audio front-end (AFE) algorithms that helps in mitigating acoustic environmental challenges and thereby improving the ASR performance. In this paper, we discuss the key algorithms within the AFE, and we provide insights into how these algorithms help in mitigating the various acoustical challenges for far-field processing. We also provide insights into the audio algorithm architecture adopted for the AFE, and we discuss ongoing and future research.

@InProceedings{8553149,
  author = {A. Chhetri and P. Hilmes and T. Kristjansson and W. Chu and M. Mansour and X. Li and X. Zhang},
  booktitle = {2018 26th European Signal Processing Conference (EUSIPCO)},
  title = {Multichannel Audio Front-End for Far-Field Automatic Speech Recognition},
  year = {2018},
  pages = {1527-1531},
  abstract = {Far-field automatic speech recognition (ASR) is a key enabling technology that allows untethered and natural voice interaction between users and Amazon Echo family of products. A key component in realizing far-field ASR on these products is the suite of audio front-end (AFE) algorithms that helps in mitigating acoustic environmental challenges and thereby improving the ASR performance. In this paper, we discuss the key algorithms within the AFE, and we provide insights into how these algorithms help in mitigating the various acoustical challenges for far-field processing. We also provide insights into the audio algorithm architecture adopted for the AFE, and we discuss ongoing and future research.},
  keywords = {acoustic signal processing;neural nets;speech recognition;audio algorithm architecture;far-field automatic speech recognition;natural voice interaction;far-field ASR;audio front-end algorithms;Amazon echo family;mitigating acoustic environment;multichannel audio front-end;AFE algorithms;deep neural networks;Acoustics;Signal processing algorithms;Array signal processing;Engines;Measurement;Microphone arrays;Beamforming;far-field;AFE;deep neural networks;ASR;Amazon Echo},
  doi = {10.23919/EUSIPCO.2018.8553149},
  issn = {2076-1465},
  month = {Sep.},
  url = {https://www.eurasip.org/proceedings/eusipco/eusipco2018/papers/1570431613.pdf},
}
