Gaussian Power flow Orientation Coefficients for noise-robust speech recognition. Gerazov, B. & Ivanovski, Z. In 2014 22nd European Signal Processing Conference (EUSIPCO), pages 1467-1471, Sep., 2014.
Spectro-temporal features have shown great promise for improving the noise robustness of Automatic Speech Recognition (ASR) systems. The common approach uses a bank of 2D Gabor filters to process the speech signal spectrogram and generate the output feature vector. This approach generates a large number of coefficients, which necessitates feature dimensionality reduction. The proposed Gaussian Power flow Orientation Coefficients (GPOCs) take an alternative approach in which only the largest coefficients output by a bank of 2D Gaussian kernels are used to describe the spectro-temporal patterns of power flow in the auditory spectrogram. While reducing the size of the feature vectors, the algorithm was shown to outperform traditional feature extraction methods, and even a reference spectro-temporal approach, at low SNRs. At high SNRs its performance is comparable but inferior to that of traditional ASR frontends, and it falls behind state-of-the-art algorithms in all noise scenarios.
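
The general idea in the abstract (filter a spectrogram with a bank of 2D Gaussian kernels, then keep only the largest outputs) can be illustrated with a short NumPy sketch. This is not the authors' GPOC algorithm. The kernel shape, the orientations, the kernel size, the per-frame top-k selection, and all function names (`oriented_gaussian_kernel`, `filter_bank_responses`, `top_k_per_frame`) are assumptions chosen for illustration, and a random array stands in for an auditory spectrogram.

```python
import numpy as np

def oriented_gaussian_kernel(theta, size=9, sigma_long=3.0, sigma_short=1.0):
    """Anisotropic 2D Gaussian elongated along angle theta (hypothetical kernel design)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate grid so that u runs along the kernel's long axis.
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    k = np.exp(-0.5 * ((u / sigma_long) ** 2 + (v / sigma_short) ** 2))
    return k / k.sum()  # normalise to unit sum

def filter_bank_responses(spec, thetas, size=9):
    """Correlate a (bands x frames) spectrogram with each oriented kernel.

    Returns an array of shape (n_kernels, bands, frames).
    """
    half = size // 2
    padded = np.pad(spec, half, mode="edge")
    # windows[f, t] is the size x size patch centred on band f, frame t.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    kernels = np.stack([oriented_gaussian_kernel(t, size) for t in thetas])
    return np.einsum("ftij,kij->kft", windows, kernels)

def top_k_per_frame(responses, k=3):
    """Keep the k largest responses in each frame (assumed selection rule).

    Returns the values plus the kernel index and band index of each one,
    all with shape (k, frames), sorted from largest to smallest.
    """
    n_kernels, n_bands, n_frames = responses.shape
    flat = responses.reshape(n_kernels * n_bands, n_frames)
    idx = np.argsort(flat, axis=0)[::-1][:k]
    vals = np.take_along_axis(flat, idx, axis=0)
    return vals, idx // n_bands, idx % n_bands

# Usage on a random stand-in for a spectrogram (20 bands, 50 frames).
rng = np.random.default_rng(0)
spec = rng.random((20, 50))
thetas = np.linspace(0.0, np.pi, 4, endpoint=False)  # 4 assumed orientations
responses = filter_bank_responses(spec, thetas)
vals, orient, band = top_k_per_frame(responses, k=3)
print(vals.shape)  # (3, 50)
```

Keeping only k values per frame, together with the orientation and band they came from, is one way such a scheme can yield a much smaller feature vector than the full filter-bank output, which here has 4 x 20 values per frame.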
