SentiCap: Generating Image Descriptions with Sentiments. Mathews, A., Xie, L., & He, X. In Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), Phoenix, Arizona, USA, 2016.
Recent progress in image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from current systems. One such style is description with emotions, which is commonplace in everyday communication and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with automatic and crowd-sourced metrics. Our model compares favourably on common quality metrics for image captioning. In 84.6% of cases the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions, 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.
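To make the "switching" idea in the abstract concrete, the sketch below mixes a factual caption stream and a sentiment stream with a per-word switch. This is a minimal illustrative assumption, not the authors' released code: the use of LSTM cells, the sigmoid switch, the layer sizes, and all module names (SwitchingCaptionRNN, factual_rnn, sentiment_rnn, etc.) are choices made here for illustration.

```python
# Conceptual sketch only (assumed architecture, not the paper's implementation):
# a word-level switching caption decoder that mixes a factual language stream
# and a sentiment stream via a learned per-word switch probability.
import torch
import torch.nn as nn

class SwitchingCaptionRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, img_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.img_proj = nn.Linear(img_dim, hidden_dim)
        # Two parallel recurrent streams: one trained on factual captions,
        # one adapted on the small set of sentiment-bearing captions.
        self.factual_rnn = nn.LSTMCell(embed_dim, hidden_dim)
        self.sentiment_rnn = nn.LSTMCell(embed_dim, hidden_dim)
        self.factual_out = nn.Linear(hidden_dim, vocab_size)
        self.sentiment_out = nn.Linear(hidden_dim, vocab_size)
        # Switch: predicts, per word, how much weight the sentiment stream gets.
        # (In the paper this switch is regularized with word-level sentiment
        # information; here it is simply learned end-to-end as a simplification.)
        self.switch = nn.Linear(2 * hidden_dim, 1)

    def forward(self, image_feat, captions):
        # image_feat: (B, img_dim) CNN features; captions: (B, T) word indices
        B, T = captions.shape
        h_f = c_f = h_s = c_s = torch.tanh(self.img_proj(image_feat))
        log_probs, switch_probs = [], []
        emb = self.embed(captions)                      # (B, T, embed_dim)
        for t in range(T):
            h_f, c_f = self.factual_rnn(emb[:, t], (h_f, c_f))
            h_s, c_s = self.sentiment_rnn(emb[:, t], (h_s, c_s))
            gamma = torch.sigmoid(self.switch(torch.cat([h_f, h_s], dim=-1)))
            p_f = torch.softmax(self.factual_out(h_f), dim=-1)
            p_s = torch.softmax(self.sentiment_out(h_s), dim=-1)
            # Word-level mixture of the factual and sentiment word distributions.
            p = (1 - gamma) * p_f + gamma * p_s
            log_probs.append(torch.log(p + 1e-8))
            switch_probs.append(gamma)
        return torch.stack(log_probs, dim=1), torch.stack(switch_probs, dim=1)
```

Under these assumptions, training would minimize the negative log-likelihood of the ground-truth words under the mixed distribution, with the switch probabilities available for any additional word-level supervision.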
@inproceedings{mathews2016senticap,
  title={{SentiCap: Generating Image Descriptions with Sentiments}},
  author={Mathews, Alexander and Xie, Lexing and He, Xuming},
  booktitle={Thirtieth {AAAI} Conference on Artificial Intelligence ({AAAI-16})},
  url_Abstract={http://arxiv.org/abs/1510.01431},
  url_Paper={http://arxiv.org/pdf/1510.01431v2.pdf},
  url_Slides={http://cm.cecs.anu.edu.au/documents/senticap_slides.pdf},
  address={Phoenix, Arizona, USA},
  year={2016},
  abstract={The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from the current systems. One such style is descriptions with emotions, which is commonplace in everyday communication, and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6\% of cases the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions 88\% were confirmed by the crowd-sourced workers as having the appropriate sentiment.},
}