LCMV Beamformer with DNN-Based Multichannel Concurrent Speakers Detector

LCMV Beamformer with DNN-Based Multichannel Concurrent Speakers Detector. Chazan, S. E., Goldberger, J., & Gannot, S. In 2018 26th European Signal Processing Conference (EUSIPCO), pages 1562-1566, Sep., 2018.


Application of the linearly constrained minimum variance (LCMV) beamformer (BF) to speaker extraction tasks in real-life scenarios necessitates a sophisticated control mechanism to facilitate the estimation of the noise spatial cross-power spectral density (cPSD) matrix and the relative transfer function (RTF) of all sources of interest. We propose a deep neural network (DNN)-based multichannel concurrent speakers detector (MCCSD) that utilizes all available microphone signals to detect the activity patterns of all speakers. Time frames classified as no active speaker frames will be utilized to estimate the cPSD, while time frames with a single detected speaker will be utilized for estimating the associated RTF. No estimation will take place during concurrent speaker activity. Experimental results show that the multi-channel approach significantly improves its single-channel counterpart.
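The frame-routing rule described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the label encoding (`NOISE_ONLY`, `CONCURRENT`, positive integers for single speakers), the helper names, and the use of covariance subtraction as the RTF estimator are all hypothetical choices that the abstract does not specify. The detector itself (the DNN) is assumed to have already produced one label per time frame.

```python
from collections import defaultdict

# Hypothetical label encoding (not taken from the paper):
NOISE_ONLY = 0    # no active speaker in the frame
CONCURRENT = -1   # more than one active speaker
# any positive integer k is read as "only speaker k is active"


def outer(x):
    """Outer product x x^H of one multichannel STFT bin (list of complex)."""
    return [[a * b.conjugate() for b in x] for a in x]


def mean_matrices(mats):
    """Element-wise average of a non-empty list of square matrices."""
    n, m = len(mats), len(mats[0])
    return [[sum(mat[i][j] for mat in mats) / n for j in range(m)]
            for i in range(m)]


def route_frames(frames, labels):
    """Send each frame to the estimator that its label allows.

    Noise-only frames update the noise cPSD, single-speaker frames update
    that speaker's covariance, and concurrent frames are skipped.
    """
    noise, speech = [], defaultdict(list)
    for x, label in zip(frames, labels):
        if label == NOISE_ONLY:
            noise.append(outer(x))
        elif label == CONCURRENT:
            continue  # no estimation during concurrent speaker activity
        else:
            speech[label].append(outer(x))
    noise_cpsd = mean_matrices(noise) if noise else None
    speaker_cov = {k: mean_matrices(v) for k, v in speech.items()}
    return noise_cpsd, speaker_cov


def rtf_by_covariance_subtraction(speaker_cov, noise_cpsd, ref=0):
    """Assumed RTF estimator: reference column of (speaker_cov - noise_cpsd),
    normalized so that the reference microphone has unit gain."""
    m = len(speaker_cov)
    col = [speaker_cov[i][ref] - noise_cpsd[i][ref] for i in range(m)]
    return [c / col[ref] for c in col]
```

In this sketch a frame is a list with one complex STFT coefficient per microphone for a single frequency bin; a full system would repeat the same bookkeeping for every bin and would typically use recursive averaging rather than a batch mean.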

@InProceedings{8553564,
  author = {S. E. Chazan and J. Goldberger and S. Gannot},
  booktitle = {2018 26th European Signal Processing Conference (EUSIPCO)},
  title = {LCMV Beamformer with DNN-Based Multichannel Concurrent Speakers Detector},
  year = {2018},
  pages = {1562--1566},
  abstract = {Application of the linearly constrained minimum variance (LCMV) beamformer (BF) to speaker extraction tasks in real-life scenarios necessitates a sophisticated control mechanism to facilitate the estimation of the noise spatial cross-power spectral density (cPSD) matrix and the relative transfer function (RTF) of all sources of interest. We propose a deep neural network (DNN)-based multichannel concurrent speakers detector (MCCSD) that utilizes all available microphone signals to detect the activity patterns of all speakers. Time frames classified as no active speaker frames will be utilized to estimate the cPSD, while time frames with a single detected speaker will be utilized for estimating the associated RTF. No estimation will take place during concurrent speaker activity. Experimental results show that the multi-channel approach significantly improves its single-channel counterpart.},
  keywords = {array signal processing;microphones;neural nets;signal classification;speaker recognition;transfer function matrices;LCMV BF;noise spatial cross-power spectral density matrix estimation;noise spatial cPSD matrix estimation;RTF;DNN-based MCCSD;microphone signal;speaker activity pattern detection;time frame classification;multichannel approach;concurrent speaker activity;single detected speaker;active speaker frames;deep neural network-based multichannel concurrent speakers detector;relative transfer function;speaker extraction tasks;linearly constrained minimum variance beamformer;Microphones;Estimation;Detectors;Databases;Dictionaries;Noise measurement;Interference},
  doi = {10.23919/EUSIPCO.2018.8553564},
  issn = {2076-1465},
  month = sep,
  url = {https://www.eurasip.org/proceedings/eusipco/eusipco2018/papers/1570437111.pdf},
}

