A survey of multimodal sentiment analysis. Soleymani, M., Garcia, D., Jou, B., Schuller, B., Chang, S., & Pantic, M. Image and Vision Computing, 65:3–14, September 2017.
@article{soleymani_survey_2017,
	title = {A survey of multimodal sentiment analysis},
	volume = {65},
	issn = {0262-8856},
	url = {https://www.sciencedirect.com/science/article/pii/S0262885617301191},
	doi = {10.1016/j.imavis.2017.08.003},
	abstract = {Sentiment analysis aims to automatically uncover the underlying attitude that we hold towards an entity. The aggregation of these sentiments over a population represents opinion polling and has numerous applications. Current text-based sentiment analysis relies on the construction of dictionaries and machine learning models that learn sentiment from large text corpora. Sentiment analysis from text is currently widely used for customer satisfaction assessment and brand perception analysis, among others. With the proliferation of social media, multimodal sentiment analysis is set to bring new opportunities with the arrival of complementary data streams for improving and going beyond text-based sentiment analysis. Since sentiment can be detected through the affective traces it leaves, such as facial and vocal displays, multimodal sentiment analysis offers promising avenues for analyzing facial and vocal expressions in addition to the transcript or textual content. These approaches leverage emotion recognition and context inference to determine the underlying polarity and scope of an individual's sentiment. In this survey, we define sentiment and the problem of multimodal sentiment analysis and review recent developments in multimodal sentiment analysis in different domains, including spoken reviews, images, video blogs, human–machine and human–human interactions. Challenges and opportunities of this emerging field are also discussed, leading to our thesis that multimodal sentiment analysis holds significant untapped potential.},
	language = {en},
	urldate = {2022-11-26},
	journal = {Image and Vision Computing},
	author = {Soleymani, Mohammad and Garcia, David and Jou, Brendan and Schuller, Björn and Chang, Shih-Fu and Pantic, Maja},
	month = sep,
	year = {2017},
	pages = {3--14},
}