Learning reconfigurable scene representation by tangram model. Zhu, J., Wu, T., Zhu, S., Yang, X., & Zhang, W. In IEEE Workshop on Applications of Computer Vision, WACV 2012, Breckenridge, CO, USA, January 9-11, pages 449–456, 2012.
Abstract: This paper proposes a method to learn reconfigurable and sparse scene representation in the joint space of spatial configuration and appearance in a principled way. We call it the tangram model, which has three properties: (1) Unlike the fixed structure of the spatial pyramid widely used in the literature, we propose a compositional shape dictionary organized in an And-Or directed acyclic graph (AOG) to quantize the space of spatial configurations. (2) The shape primitives (called tans) in the dictionary can be described by using any "off-the-shelf" appearance features according to different tasks. (3) A dynamic programming (DP) algorithm is utilized to learn the globally optimal parse tree in the joint space of spatial configuration and appearance. We demonstrate the tangram model in both a generative learning formulation and a discriminative matching kernel. In experiments, we show that the tangram model is capable of capturing meaningful spatial configurations as well as appearance for various scene categories, and achieves state-of-the-art classification performance on the LSP 15-class scene dataset and the MIT 67-class indoor scene dataset.
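To give some intuition for item (3) of the abstract, the sketch below is a minimal, hypothetical illustration (not the paper's code) of dynamic programming over an And-Or graph: Or-nodes select their best-scoring alternative, And-nodes sum the scores of their decomposition, and memoization over the DAG yields a globally optimal parse-tree score for a given scoring function. The `Node` structure and the `appearance_score` callback are assumptions made purely for illustration.

```python
# Illustrative sketch only -- not the authors' implementation. It shows the
# generic bottom-up DP used to pick a best-scoring parse tree from an
# And-Or graph (AOG): Or-nodes choose the best alternative, And-nodes sum
# the scores of their parts. `Node` and `appearance_score` are hypothetical.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    kind: str                                   # "tan" (terminal), "and", or "or"
    children: List[int] = field(default_factory=list)


def best_parse_scores(
    nodes: Dict[int, Node],
    appearance_score: Callable[[int], float],   # score of a tan's appearance model
) -> Dict[int, float]:
    """Compute the best parse-tree score rooted at every node of the AOG."""
    best: Dict[int, float] = {}

    def score(i: int) -> float:
        if i in best:                           # memoization: the AOG is a DAG,
            return best[i]                      # so shared sub-shapes are reused
        node = nodes[i]
        if node.kind == "tan":
            s = appearance_score(i)             # any "off-the-shelf" feature score
        elif node.kind == "and":
            s = sum(score(c) for c in node.children)   # decomposition into parts
        else:                                   # "or": pick the best configuration
            s = max(score(c) for c in node.children)
        best[i] = s
        return s

    for i in nodes:
        score(i)
    return best
```

The returned score at the root node is that of the globally optimal parse tree; keeping argmax pointers at the Or-nodes would recover the tree itself.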
@InProceedings{Tangram-WACV,
author = {Jun Zhu and Tianfu Wu and Song{-}Chun Zhu and Xiaokang Yang and Wenjun Zhang},
title = {Learning reconfigurable scene representation by tangram model},
booktitle = {{IEEE} Workshop on Applications of Computer Vision, {WACV} 2012, Breckenridge, CO, USA, January 9-11},
year = {2012},
pages = {449--456},
url_doi = {http://dx.doi.org/10.1109/WACV.2012.6163023},
url_paper = {papers/Tangram_TIP.pdf},
keywords = {Tangram Model, Scene Layout, And-Or Graph, Dynamic Programming, Scene Categorization},
abstract = {This paper proposes a method to learn reconfigurable and sparse scene representation in the joint space of spatial configuration and appearance in a principled way. We call it the tangram model, which has three properties: (1) Unlike fixed structure of the spatial pyramid widely used in the literature, we propose a compositional shape dictionary organized in an And-Or directed acyclic graph (AOG) to quantize the space of spatial configurations. (2) The shape primitives (called tans) in the dictionary can be described by using any "off-the-shelf" appearance features according to different tasks. (3) A dynamic programming (DP) algorithm is utilized to learn the globally optimal parse tree in the joint space of spatial configuration and appearance. We demonstrate the tangram model in both a generative learning formulation and a discriminative matching kernel. In experiments, we show that the tangram model is capable of capturing meaningful spatial configurations as well as appearance for various scene categories, and achieves state-of-the-art classification performance on the LSP 15-class scene dataset and the MIT 67-class indoor scene dataset.}
}
{"_id":"a6q62QzrSuEf8ELAR","bibbaseid":"zhu-wu-zhu-yang-zhang-learningreconfigurablescenerepresentationbytangrammodel-2012","downloads":0,"creationDate":"2016-09-17T23:01:49.848Z","title":"Learning reconfigurable scene representation by tangram model","author_short":["Zhu, J.","Wu, T.","Zhu, S.","Yang, X.","Zhang, W."],"year":2012,"bibtype":"inproceedings","biburl":"https://tfwu.github.io/TianfuWu_BibTex.bib","bibdata":{"bibtype":"inproceedings","type":"inproceedings","author":[{"firstnames":["Jun"],"propositions":[],"lastnames":["Zhu"],"suffixes":[]},{"firstnames":["Tianfu"],"propositions":[],"lastnames":["Wu"],"suffixes":[]},{"firstnames":["Song-Chun"],"propositions":[],"lastnames":["Zhu"],"suffixes":[]},{"firstnames":["Xiaokang"],"propositions":[],"lastnames":["Yang"],"suffixes":[]},{"firstnames":["Wenjun"],"propositions":[],"lastnames":["Zhang"],"suffixes":[]}],"title":"Learning reconfigurable scene representation by tangram model","booktitle":"IEEE Workshop on Applications of Computer Vision, WACV 2012, Breckenridge, CO, USA, January 9-11","year":"2012","pages":"449–456","url_doi":"http://dx.doi.org/10.1109/WACV.2012.6163023","url_paper":"papers/Tangram_TIP.pdf","keywords":"Tangram Model, Scene Layout, And-Or Graph, Dynamic Programming, Scene Categorization","abstract":"This paper proposes a method to learn reconfigurable and sparse scene representation in the joint space of spatial configuration and appearance in a principled way. We call it the tangram model, which has three properties: (1) Unlike fixed structure of the spatial pyramid widely used in the literature, we propose a compositional shape dictionary organized in an And-Or directed acyclic graph (AOG) to quantize the space of spatial configurations. (2) The shape primitives (called tans) in the dictionary can be described by using any ”off-the-shelf” appearance features according to different tasks. (3) A dynamic programming (DP) algorithm is utilized to learn the globally optimal parse tree in the joint space of spatial configuration and appearance. We demonstrate the tangram model in both a generative learning formulation and a discriminative matching kernel. In experiments, we show that the tangram model is capable of capturing meaningful spatial configurations as well as appearance for various scene categories, and achieves state-of-the-art classification performance on the LSP 15-class scene dataset and the MIT 67-class indoor scene dataset.","bibtex":"@InProceedings{Tangram-WACV,\n author = {Jun Zhu and Tianfu Wu and Song{-}Chun Zhu and Xiaokang Yang and Wenjun Zhang},\n title = {Learning reconfigurable scene representation by tangram model},\n booktitle = {{IEEE} Workshop on Applications of Computer Vision, {WACV} 2012, Breckenridge, CO, USA, January 9-11},\n year = {2012},\n pages = {449--456},\n url_doi = {http://dx.doi.org/10.1109/WACV.2012.6163023},\n url_paper = {papers/Tangram_TIP.pdf}, \n keywords = {Tangram Model, Scene Layout, And-Or Graph, Dynamic Programming, Scene Categorization},\n abstract = {This paper proposes a method to learn reconfigurable and sparse scene representation in the joint space of spatial configuration and appearance in a principled way. We call it the tangram model, which has three properties: (1) Unlike fixed structure of the spatial pyramid widely used in the literature, we propose a compositional shape dictionary organized in an And-Or directed acyclic graph (AOG) to quantize the space of spatial configurations. 
(2) The shape primitives (called tans) in the dictionary can be described by using any ”off-the-shelf” appearance features according to different tasks. (3) A dynamic programming (DP) algorithm is utilized to learn the globally optimal parse tree in the joint space of spatial configuration and appearance. We demonstrate the tangram model in both a generative learning formulation and a discriminative matching kernel. In experiments, we show that the tangram model is capable of capturing meaningful spatial configurations as well as appearance for various scene categories, and achieves state-of-the-art classification performance on the LSP 15-class scene dataset and the MIT 67-class indoor scene dataset.} \n}\n\n","author_short":["Zhu, J.","Wu, T.","Zhu, S.","Yang, X.","Zhang, W."],"key":"Tangram-WACV","id":"Tangram-WACV","bibbaseid":"zhu-wu-zhu-yang-zhang-learningreconfigurablescenerepresentationbytangrammodel-2012","role":"author","urls":{" doi":"http://dx.doi.org/10.1109/WACV.2012.6163023"," paper":"https://tfwu.github.io/papers/Tangram_TIP.pdf"},"keyword":["Tangram Model","Scene Layout","And-Or Graph","Dynamic Programming","Scene Categorization"],"downloads":0,"html":""},"search_terms":["learning","reconfigurable","scene","representation","tangram","model","zhu","wu","zhu","yang","zhang"],"keywords":["tangram model","scene layout","and-or graph","dynamic programming","scene categorization"],"authorIDs":[],"dataSources":["MMF3y5eBrtyhnDQun"]}