Perceptual quality of audio separated using sigmoidal masks. Stokes, T., Hummersone, C., Brookes, T., & Mason, A. In Proceedings of the 137th Audio Engineering Society Convention, Los Angeles, October, 2014.
Separation of underdetermined audio mixtures is often performed in the Time-Frequency (TF) domain by masking each TF element according to the amount of target energy it is deemed to contain. This work uses sigmoidal functions to map the proportion of target energy to mask values. The series of sigmoidal functions used encompasses the ratio mask and an approximation of the binary mask. Mixtures are chosen to represent a range of different amounts of TF overlap, then separated and evaluated using objective measures. PEASS results show improved interferer suppression and artifact scores can be achieved using softer masking than that applied by binary or ratio masks. This improves the overall perceptual score of the separated audio.
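The mapping from target-energy proportion to mask value described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact parameterization, so the family below is an assumption chosen only because it has the stated properties: it is a logistic function of the log target-to-interferer energy ratio, it reduces to the ratio mask at one setting, and it approaches the binary mask at another. The function name `sigmoidal_mask` and the `slope` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoidal_mask(target_proportion, slope=1.0):
    """Map the proportion of target energy in each TF element to a mask value.

    Assumed family (not necessarily the paper's): m = p^a / (p^a + (1 - p)^a),
    which equals a logistic function of a * ln(target / interferer energy).
    `slope` (a) must be positive.
    """
    # Proportion of target energy per TF element, limited to [0, 1].
    p = np.clip(np.asarray(target_proportion, dtype=float), 0.0, 1.0)
    num = p ** slope
    return num / (num + (1.0 - p) ** slope)


# Example proportions of target energy in five TF elements.
p = np.array([0.1, 0.3, 0.5, 0.7, 0.9])

ratio_like = sigmoidal_mask(p, slope=1.0)    # slope 1 reproduces the ratio mask
near_binary = sigmoidal_mask(p, slope=50.0)  # large slope approximates the binary mask
softer = sigmoidal_mask(p, slope=0.5)        # slope below 1 is softer than the ratio mask
```

Under this assumed family, a slope below 1 pulls mask values toward 0.5, which corresponds to the "softer masking" the abstract compares against the binary and ratio masks. In use, the mask would be multiplied element-wise with the mixture's TF representation before resynthesis.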
