Stereophonic music separation based on non-negative tensor factorization with cepstrum regularization. Seki, S., Toda, T., & Takeda, K. In 2017 25th European Signal Processing Conference (EUSIPCO), pages 981-985, Aug, 2017.
Stereophonic music separation based on non-negative tensor factorization with cepstrum regularization [pdf]Paper  doi  abstract   bibtex   
This paper presents a novel approach to stereophonic music separation based on Non-negative Tensor Factorization (NTF). Stereophonic music is roughly divided into two types; recorded music or synthesized music, which we focus on synthesized one in this paper. Synthesized music signals are often generated as linear combinations of many individual source signals with their mixing gains (i.e., time-invariant amplitude scaling) to each channel signal. Therefore, the synthesized stereophonic music separation is the underdetermined source separation problem where phase components are not helpful for the separation. NTF is one of the effective techniques to handle this problem, decomposing amplitude spectrograms of the stereo channel music signal into basis vectors and activations of individual music source signals and their corresponding mixing gains. However, it is essentially difficult to obtain sufficient separation performance in this separation problem as available acoustic cues for separation are limited. To address this issue, we propose a cepstrum regularization method for NTF-based stereo channel separation. The proposed method makes the separated music source signals follow the corresponding Gaussian mixture models of individual music source signals, which are trained in advance using their available samples. An experimental evaluation using real music signals is conducted to investigate the effectiveness of the proposed method in both supervised and unsupervised separation frameworks. The experimental results demonstrate that the proposed method yields significant improvements in separation performance in both frameworks.

Downloads: 0