Automatically estimating emotion in music with deep long-short term memory recurrent neural networks. Coutinho, E., Trigeorgis, G., Zafeiriou, S., & Schuller, B. In Larson, M., Ionescu, B., Sjöberg, M., Anguera, X., Poignant, J., Riegler, M., Eskevich, M., Hauff, C., Sutcliffe, R., Jones, G. J., Yang, Y., Soleymani, M., & Papadopoulos, S., editors, CEUR Workshop Proceedings, volume 1436, pages 1-3, September 2015. CEUR.
In this paper we describe our approach to the MediaEval "Emotion in Music" task. Our method consists of deep Long-Short Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression, using acoustic and psychoacoustic features extracted from the songs that have previously been shown to be effective for emotion prediction in music. Results on the challenge test set demonstrate excellent performance for Arousal estimation (r = 0.613 ± 0.278), but not for Valence (r = 0.026 ± 0.500). Issues with the reliability and distribution of the test set annotations are indicated as plausible explanations for these results. Using a subset of the development set that was held out for performance estimation, we determined that the performance of our approach on Valence may be underestimated (Arousal: r = 0.596 ± 0.386; Valence: r = 0.458 ± 0.551).
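
For readers unfamiliar with the setup, a minimal sketch of this kind of frame-level Arousal/Valence regression with a stacked LSTM is given below, written in PyTorch. It illustrates the general technique only: the feature dimensionality (260), hidden size, layer count, batch shape, and training details are illustrative placeholders, not the configuration used in the paper.

import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    """Stacked LSTM mapping per-frame acoustic features to continuous
    Arousal and Valence values, one prediction per time step."""

    def __init__(self, n_features=260, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,  # stacked layers make the network "deep"
            batch_first=True,
        )
        # Linear regression head applied at every frame: (arousal, valence)
        self.head = nn.Linear(hidden_size, 2)

    def forward(self, x):
        # x: (batch, time, n_features) -> (batch, time, 2)
        out, _ = self.lstm(x)
        return self.head(out)

# Hypothetical shapes: 8 songs, 60 frames each, 260 features per frame
model = EmotionLSTM()
features = torch.randn(8, 60, 260)
targets = torch.randn(8, 60, 2)  # stand-in for per-frame A/V annotations

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

optimizer.zero_grad()
loss = criterion(model(features), targets)
loss.backward()
optimizer.step()

In the actual task, the inputs would be the acoustic and psychoacoustic descriptors extracted at a fixed frame rate, and the targets the continuous per-frame emotion annotations.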
