Preserving Actual Dynamic Trend of Emotion in Dimensional Speech Emotion Recognition. Han, W., Li, H., Eyben, F., Ma, L., Sun, J., & Schuller, B. In Proceedings of the 14th ACM International Conference on Multimodal Interaction (ICMI '12), pages 523--528, New York, NY, USA, 2012. ACM.
In this paper, we use the concept of the dynamic trend of emotion to describe how a person's emotion changes over time, which is believed to be important for understanding one's stance toward the current topic in interactions. However, to the best of our knowledge, this concept has not received enough attention in the field of speech emotion recognition (SER). This paper therefore aims to draw researchers' attention to the concept and makes a preliminary effort toward predicting the correct dynamic trend of emotion in SER. Specifically, we propose a novel algorithm named Order Preserving Network (OPNet) to this end. First, as the key issue in OPNet construction, we propose a probabilistic method to define an emotion trend-sensitive loss function. Then, a nonlinear neural network is trained with gradient descent to minimize the constructed loss function. We validated the prediction performance of OPNet on the VAM corpus, using mean linear error and a rank correlation coefficient γ as measures. Compared with k-Nearest Neighbor and support vector regression, the proposed OPNet performs better at preserving the actual dynamic trend of emotion.
@inproceedings{han_preserving_2012,
	address = {New York, NY, USA},
	series = {{ICMI} '12},
	title = {Preserving {Actual} {Dynamic} {Trend} of {Emotion} in {Dimensional} {Speech} {Emotion} {Recognition}},
	isbn = {978-1-4503-1467-1},
	url = {http://doi.acm.org/10.1145/2388676.2388786},
	doi = {10.1145/2388676.2388786},
	abstract = {In this paper, we use the concept of dynamic trend of emotion to describe how a human's emotion changes over time, which is believed to be important for understanding one's stance toward current topic in interactions. However, the importance of this concept - to our best knowledge - has not been paid enough attention before in the field of speech emotion recognition (SER). Inspired by this, this paper aims to evoke researchers' attention on this concept and makes a primary effort on the research of predicting correct dynamic trend of emotion in the process of SER. Specifically, we propose a novel algorithm named Order Preserving Network (OPNet) to this end. First, as the key issue for OPNet construction, we propose employing a probabilistic method to define an emotion trend-sensitive loss function. Then, a nonlinear neural network is trained using the gradient descent as optimization algorithm to minimize the constructed loss function. We validated the prediction performance of OPNet on the VAM corpus, by mean linear error as well as a rank correlation coefficient γ as measures. Comparing to k-Nearest Neighbor and support vector regression, the proposed OPNet performs better on the preservation of actual dynamic trend of emotion.},
	urldate = {2014-06-05},
	booktitle = {Proceedings of the 14th {ACM} {International} {Conference} on {Multimodal} {Interaction}},
	publisher = {ACM},
	author = {Han, Wenjing and Li, Haifeng and Eyben, Florian and Ma, Lin and Sun, Jiayin and Schuller, Björn},
	year = {2012},
	pages = {523--528}
}
