AMECON: Abstract Meta-Concept Features for Text-Illustration. Chami, I., Tamaazousti, Y., & Le Borgne, H. In ACM International Conference on Multimedia Retrieval (ICMR), Bucharest, 2017.
Cross-media retrieval is a problem of high interest at the frontier between computer vision and natural language processing. The state of the art in the domain consists of learning a common space, under constraints of correlation or similarity, from textual and visual modalities that are processed in parallel and possibly jointly. This paper proposes a different approach that treats the cross-modal problem as a supervised mapping of the visual modality to the textual one. Each modality is thus seen as a particular projection of an abstract meta-concept, each of whose dimensions subsumes several semantic concepts (the "meta" aspect) but may not correspond to an actual one (the "abstract" aspect). In practice, the textual modality is used to generate a multi-label representation, which then serves as the target for mapping the visual modality through a simple shallow neural network. Although the approach is quite easy to implement, the experiments show that it significantly outperforms the state of the art on the Flickr-8K and Flickr-30K datasets for the text-illustration task.
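The pipeline outlined in the abstract (text turned into multi-label targets, then a shallow network mapping visual features onto those labels) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the vocabulary-based labelling, the single tanh hidden layer, the sigmoid cross-entropy loss, the cosine ranking, and every function name are assumptions made for the example.

```python
import numpy as np

def multilabel_targets(captions, vocab):
    """Assumed labelling: one binary label per vocabulary word found in a caption."""
    index = {word: i for i, word in enumerate(vocab)}
    Y = np.zeros((len(captions), len(vocab)))
    for row, caption in enumerate(captions):
        for word in caption.lower().split():
            if word in index:
                Y[row, index[word]] = 1.0
    return Y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_shallow_net(X, Y, hidden=16, lr=0.5, epochs=500, seed=0):
    """One hidden layer trained by gradient descent on mean sigmoid cross-entropy."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    n = X.shape[0]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # hidden activations
        P = sigmoid(H @ W2 + b2)           # predicted label probabilities
        dZ2 = (P - Y) / n                  # gradient w.r.t. output logits
        dZ1 = (dZ2 @ W2.T) * (1.0 - H**2)  # backprop through tanh
        W2 -= lr * (H.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1)
        b1 -= lr * dZ1.sum(axis=0)
    return W1, b1, W2, b2

def predict_labels(params, X):
    """Map visual features into the multi-label (textual) space."""
    W1, b1, W2, b2 = params
    return sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)

def rank_images(query_labels, image_label_scores):
    """Return image indices sorted by cosine similarity to the query labels."""
    q = query_labels / (np.linalg.norm(query_labels) + 1e-12)
    norms = np.linalg.norm(image_label_scores, axis=1, keepdims=True) + 1e-12
    return np.argsort(-((image_label_scores / norms) @ q))
```

In this sketch, text illustration amounts to converting the query text with `multilabel_targets`, projecting every candidate image with `predict_labels`, and taking the top entries of `rank_images`. The visual features `X` are assumed to come from some pretrained image encoder, which is outside the scope of the example.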
