Glottal mixture model (GLOMM) for speaker identification on telephone channels

Glottal mixture model (GLOMM) for speaker identification on telephone channels. Baggenstoss, P. M., Wilkinghoff, K., & Kurth, F. In 2017 25th European Signal Processing Conference (EUSIPCO), pages 2734-2738, Aug, 2017.


The Glottal Mixture Model (GLOMM) extracts speaker-dependent voice source information from speech data. It has previously been shown to provide speaker identification performance on clean speech comparable to that of the universal background model (UBM), a state-of-the-art method based on MFCC. When GLOMM was combined with UBM, the error rate was reduced by a factor of three, showing that the voice source information is largely independent of the information contained in the MFCC, yet holds as much speaker-related information. We now describe how GLOMM can be adapted for telephone-quality audio and provide significant error reduction when combined with UBM and I-vector approaches. We demonstrate a factor-of-two error reduction on the NTIMIT data set with respect to the best published results.
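The abstract does not say how GLOMM is combined with UBM or I-vector systems. As a rough illustration of why two largely independent information sources can reduce identification error, the sketch below assumes simple score-level fusion: each system produces a per-speaker score for every utterance, the scores are normalized per utterance, and a weighted sum is taken. The function names (`fuse_scores`, `identify`, `error_rate`), the equal weighting, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted sum of two (n_utterances, n_speakers) score matrices.

    Assumption: each row is z-normalized first so that the two systems
    contribute on a comparable scale. This is one common fusion recipe,
    not necessarily the one used by the authors.
    """
    def znorm(s):
        s = np.asarray(s, dtype=float)
        mean = s.mean(axis=1, keepdims=True)
        std = s.std(axis=1, keepdims=True)
        # Guard against constant rows to avoid division by zero.
        return (s - mean) / np.where(std > 0, std, 1.0)

    return weight * znorm(scores_a) + (1.0 - weight) * znorm(scores_b)


def identify(scores):
    """Pick the highest-scoring speaker index for each utterance."""
    return np.argmax(np.asarray(scores, dtype=float), axis=1)


def error_rate(predicted, truth):
    """Fraction of utterances assigned to the wrong speaker."""
    return float(np.mean(np.asarray(predicted) != np.asarray(truth)))


# Made-up scores for 3 utterances and 3 speakers (illustration only).
truth = np.array([0, 1, 2])
scores_a = np.array([[2.0, 0.0, 0.0],   # hypothetical system A
                     [1.0, 0.8, 0.0],   # A confuses utterance 1
                     [0.0, 0.0, 2.0]])
scores_b = np.array([[2.0, 0.0, 0.0],   # hypothetical system B
                     [0.0, 2.0, 0.0],
                     [0.0, 1.0, 0.8]])  # B confuses utterance 2

print("A alone:", error_rate(identify(scores_a), truth))
print("B alone:", error_rate(identify(scores_b), truth))
print("fused:  ", error_rate(identify(fuse_scores(scores_a, scores_b)), truth))
```

In this toy case each system makes a different mistake, so the confident score from one system outweighs the marginal error of the other after fusion. Real systems would tune `weight` on held-out data.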

@InProceedings{8081708,
  author = {P. M. Baggenstoss and K. Wilkinghoff and F. Kurth},
  booktitle = {2017 25th European Signal Processing Conference (EUSIPCO)},
  title = {Glottal mixture model (GLOMM) for speaker identification on telephone channels},
  year = {2017},
  pages = {2734-2738},
  abstract = {The Glottal Mixture Model (GLOMM) extracts speaker-dependent voice source information from speech data. It has previously been shown to provide speaker identification performance on clean speech comparable to universal background model (UBM), a state of the art method based on MFCC. And, when combined with UBM, the error rate was reduced by a factor of three, showing that the voice source information is largely independent of the information contained in the MFCC, yet holds as much speaker-related information. We now describe how GLOMM can be adapted for telephone quality audio and provide significant error reduction when combined with UBM and I-vector approaches. We demonstrate a factor of two error reduction on the NTIMIT data set with respect to the best published results.},
  keywords = {feature extraction;mixture models;speaker recognition;GLOMM;telephone channels;speaker-dependent voice source information;speech data;speaker identification performance;MFCC;error rate;telephone quality audio;error reduction;glottal mixture model;speaker-related information;I-vector approach;UBM approach;NTIMIT data set;Mel frequency cepstral coefficient;Speech;Feature extraction;Signal processing algorithms;Telephone sets;Europe},
  doi = {10.23919/EUSIPCO.2017.8081708},
  issn = {2076-1465},
  month = {Aug},
  url = {https://www.eurasip.org/proceedings/eusipco/eusipco2017/papers/1570343401.pdf},
}

