An Object-based Bayesian Framework for Top-down Visual Attention. Borji, A., Sihite, D. N., & Itti, L. In Proc. Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-12), Toronto, Canada, pages 1529-1535, Aug, 2012.

Abstract: We introduce a new task-independent framework to model top-down overt visual attention based on graphical models for probabilistic inference and reasoning. We describe a Dynamic Bayesian Network (DBN) that infers probability distributions over attended objects and spatial locations directly from observed data. Probabilistic inference in our model is performed over object-related functions which are fed from manual annotations of objects in video scenes or by state-of-the-art object detection models. Evaluating over approx. 3 hours (approx. 315,000 eye fixations and 12,600 saccades) of observers playing 3 video games (time-scheduling, driving, and flight combat), we show that our approach is significantly more predictive of eye fixations compared to: 1) simpler classifier-based models also developed here that map a signature of a scene (multi-modal information from gist, bottom-up saliency, physical actions, and events) to eye positions, 2) 14 state-of-the-art bottom-up saliency models, and 3) brute-force algorithms such as mean eye position. Our results show that the proposed model is more effective in employing and reasoning over spatio-temporal visual data.
@inproceedings{ Borji_etal12aaai,
author = {A. Borji and D. N. Sihite and L. Itti},
title = {An Object-based Bayesian Framework for Top-down Visual Attention},
booktitle = {Proc. Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-12), Toronto, Canada},
abstract = {We introduce a new task-independent framework to model top-down overt visual attention based on
graphical models for probabilistic inference and reasoning. We describe a Dynamic Bayesian Network (DBN)
that infers probability distributions over attended objects and spatial locations directly from observed
data. Probabilistic inference in our model is performed over object-related functions which are fed
from manual annotations of objects in video scenes or by state-of-the-art object detection
models. Evaluating over approx. 3 hours (approx. 315,000 eye fixations and 12,600 saccades) of observers
playing 3 video games (time-scheduling, driving, and flight combat), we show that our approach is
significantly more predictive of eye fixations compared to: 1) simpler classifier-based models also
developed here that map a signature of a scene (multi-modal information from gist, bottom-up saliency,
physical actions, and events) to eye positions, 2) 14 state-of-the-art bottom-up saliency models, and
3) brute-force algorithms such as mean eye position. Our results show that the proposed model is
more effective in employing and reasoning over spatio-temporal visual data.},
pages = {1529-1535},
month = {Aug},
year = {2012},
review = {full/conf},
type = {bu;td;mod;cv},
if = {2012 acceptance rate: 26.0%},
file = {http://ilab.usc.edu/publications/doc/Borji_etal12aaai.pdf}
}
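
As a purely illustrative companion to the abstract: the paper's central object is a Dynamic Bayesian Network that maintains a belief over which object is currently attended, updated as new evidence arrives. The sketch below shows a minimal DBN-style forward-filtering update in Python/NumPy. The object labels, transition matrix, observation model, and evidence variable are all invented for this sketch; the paper's actual network structure, variables, and learned parameters are not reproduced here.

import numpy as np

# Purely illustrative sketch: forward filtering in a simple two-node DBN
# (hidden attended-object state -> observed discrete evidence). This is
# NOT the network, variables, or parameters from Borji et al. (2012).

objects = ["car", "traffic_light", "speedometer"]  # hypothetical attended objects

# Transition model T[i, j] = P(X_t = j | X_{t-1} = i): gaze tends to linger
# on the currently attended object (made-up numbers).
T = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.3, 0.1, 0.6]])

# Observation model O[x, e] = P(e_t = e | X_t = x) for a made-up evidence
# variable with values (0: none, 1: light_change, 2: low_fuel).
O = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.6, 0.1],
              [0.4, 0.1, 0.5]])

def filter_step(belief, evidence):
    """One DBN filtering update: predict through the transition model,
    weight by the likelihood of the observed evidence, renormalize."""
    predicted = T.T @ belief             # sum_i P(X_t | X_{t-1}=i) * belief[i]
    updated = O[:, evidence] * predicted # multiply in P(e_t | X_t)
    return updated / updated.sum()

belief = np.full(len(objects), 1.0 / len(objects))  # uniform prior over objects
for e in [0, 1, 1, 2]:                              # toy evidence sequence
    belief = filter_step(belief, e)
    print(dict(zip(objects, np.round(belief, 3))))

The paper additionally infers spatial locations conditioned on the attended object and learns its conditional distributions from recorded game-play and eye-tracking data; a full model would add those nodes rather than the hand-set tables above.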