Probabilistic Learning of Task-Specific Visual Attention. Borji, A., Sihite, D. N., & Itti, L. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, Rhode Island, pages 1-8, Jun, 2012.
Despite a considerable amount of previous work on bottom-up saliency modeling for predicting human fixations over static and dynamic stimuli, few studies have thus far attempted to model top-down and task-driven influences on visual attention. Here, taking advantage of the sequential nature of real-world tasks, we propose a unified Bayesian approach for modeling task-driven visual attention. Several sources of information, including the global context of a scene, previously attended locations, and previous motor actions, are integrated over time to predict the next attended location. Recording eye movements while subjects engage in five contemporary 2D and 3D video games, as modest counterparts of everyday tasks, we show that our approach predicts human attention and gaze better than the state of the art, by a large margin (about a 15 percent increase in prediction accuracy). The advantage of our approach is that it is automatic and applicable to arbitrary visual tasks.
@inproceedings{Borji_etal12cvpr,
  author = {A. Borji and D. N. Sihite and L. Itti},
  title = {Probabilistic Learning of Task-Specific Visual Attention},
  booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, Rhode Island},
  abstract = {Despite a considerable amount of previous work on bottom-up saliency modeling for predicting human fixations
                  over static and dynamic stimuli, few studies have thus far attempted to model top-down and task-driven
                  influences of visual attention. Here, taking advantage of the sequential nature of real-world tasks,
                  we propose a unified Bayesian approach for modeling task-driven visual attention. Several sources of
                  information, including global context of a scene, previous attended locations, and previous motor
                  actions, are integrated over time to predict the next attended location. Recording eye movements while
                  subjects engage in 5 contemporary 2D and 3D video games, as modest counterparts of everyday tasks, we
                  show that our approach predicts human attention and gaze better than the state of the art,
                  by a large margin (about a 15 percent increase in prediction accuracy). The advantage of our approach
                  is that it is automatic and applicable to arbitrary visual tasks.},
  month = {Jun},
  pages = {1--8},
  year = {2012},
  review = {full/conf},
  type = {bu;td;mod;cv},
  if = {2012 acceptance rate: 26.2%},
  file = {http://ilab.usc.edu/publications/doc/Borji_etal12cvpr.pdf}
}
