Picture Tags and World Knowledge: Learning Tag Relations from Visual Semantic Sources. Xie, L. & He, X. In Proceedings of the 21st ACM International Conference on Multimedia (MM '13), pages 967–976, New York, NY, USA, 2013. ACM.
Paper (DOI): http://doi.acm.org/10.1145/2502081.2502113
Paper (PDF): http://cecs.anu.edu.au/~xlx/papers/mm2013-xie.pdf
Slides: http://cecs.anu.edu.au/~xlx/proj/tagnet/mm2013-tagnet.pdf
Page: http://users.cecs.anu.edu.au/~xlx/proj/tagnet/
DOI: 10.1145/2502081.2502113
Abstract: This paper studies the use of everyday words to describe images. The common saying has it that 'a picture is worth a thousand words'; here we ask: which thousand? The proliferation of tagged social multimedia data presents a challenge to understanding collective tag use at large scale – one can ask whether patterns from photo tags help us understand tag-tag relations, and how they can be leveraged to improve visual search and recognition. We propose a new method to jointly analyze three distinct visual knowledge resources: Flickr, ImageNet/WordNet, and ConceptNet. This allows us to quantify the visual relevance of tags and learn their relationships. We propose a novel network estimation algorithm, Inverse Concept Rank, to infer incomplete tag relationships. We then design an algorithm for image annotation that takes into account both image and tag features. We analyze over 5 million photos with over 20,000 visual tags. The statistics from this collection lead to good results for image tagging, relationship estimation, and generalization to unseen tags. This is a first step in analyzing picture tags and everyday semantic knowledge. Other potential applications include generating natural language descriptions of pictures, as well as validating and supplementing knowledge databases.
@inproceedings{Xie:2013:PTW:2502081.2502113,
author = {Xie, Lexing and He, Xuming},
title = {{Picture Tags and World Knowledge: Learning Tag Relations from Visual Semantic Sources}},
booktitle = {Proceedings of the 21st ACM International Conference on Multimedia},
series = {MM '13},
year = {2013},
isbn = {978-1-4503-2404-5},
location = {Barcelona, Spain},
pages = {967--976},
numpages = {10},
url = {http://doi.acm.org/10.1145/2502081.2502113},
doi = {10.1145/2502081.2502113},
acmid = {2502113},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {folksonomy, knowledge graph, social media},
abstract = {This paper studies the use of everyday words to describe images. The common saying has it that 'a picture is worth a thousand words'; here we ask: which thousand? The proliferation of tagged social multimedia data presents a challenge to understanding collective tag use at large scale -- one can ask whether patterns from photo tags help us understand tag-tag relations, and how they can be leveraged to improve visual search and recognition. We propose a new method to jointly analyze three distinct visual knowledge resources: Flickr, ImageNet/WordNet, and ConceptNet. This allows us to quantify the visual relevance of tags and learn their relationships. We propose a novel network estimation algorithm, Inverse Concept Rank, to infer incomplete tag relationships. We then design an algorithm for image annotation that takes into account both image and tag features. We analyze over 5 million photos with over 20,000 visual tags. The statistics from this collection lead to good results for image tagging, relationship estimation, and generalization to unseen tags. This is a first step in analyzing picture tags and everyday semantic knowledge. Other potential applications include generating natural language descriptions of pictures, as well as validating and supplementing knowledge databases.},
url_paper = {http://cecs.anu.edu.au/~xlx/papers/mm2013-xie.pdf},
url_slides = {http://cecs.anu.edu.au/~xlx/proj/tagnet/mm2013-tagnet.pdf},
url_page = {http://users.cecs.anu.edu.au/~xlx/proj/tagnet/}
}
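For convenience, here is a minimal LaTeX sketch showing one way to cite the entry above by its BibTeX key; the file name publications.bib, the acm bibliography style, and the surrounding preamble are illustrative assumptions, not part of the record.

\documentclass{article}
\begin{document}
% Cite the MM '13 paper using the key from the BibTeX record above
Learning tag relations from visual semantic sources~\cite{Xie:2013:PTW:2502081.2502113}.
% Assumes the record is saved in publications.bib (hypothetical file name)
\bibliographystyle{acm}
\bibliography{publications}
\end{document}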