What can saliency models predict about eye movements? Spatial and sequential aspects of fixations during encoding and recognition. Foulsham, T. & Underwood, G. J Vis, 8:1–17, 2008.
Saliency map models account for a small but significant amount of the variance in where people fixate, but evaluating these models with natural stimuli has led to mixed results. In the present study, the eye movements of participants were recorded while they viewed color photographs of natural scenes in preparation for a memory test (encoding) and when recognizing them later. These eye movements were then compared to the predictions of a well defined saliency map model (L. Itti & C. Koch, 2000), in terms of both individual fixation locations and fixation sequences (scanpaths). The saliency model is a significantly better predictor of fixation location than random models that take into account bias toward central fixations, and this is the case at both encoding and recognition. However, similarity between scanpaths made at multiple viewings of the same stimulus suggests that repetitive scanpaths also contribute to where people look. Top-down recapitulation of scanpaths is a key prediction of scanpath theory (D. Noton & L. Stark, 1971), but it might also be explained by bottom-up guidance. The present data suggest that saliency cannot account for scanpaths and that incorporating these sequences could improve model predictions.
@article{ Foulsham_Underwood08,
  author = {Foulsham, T. and Underwood, G.},
  title = {{W}hat can saliency models predict about eye movements? {S}patial
	and sequential aspects of fixations during encoding and recognition},
  journal = {J Vis},
  year = {2008},
  volume = {8},
  pages = {1--17},
  abstract = {Saliency map models account for a small but significant amount of
	the variance in where people fixate, but evaluating these models
	with natural stimuli has led to mixed results. In the present study,
	the eye movements of participants were recorded while they viewed
	color photographs of natural scenes in preparation for a memory test
	(encoding) and when recognizing them later. These eye movements were
	then compared to the predictions of a well defined saliency map model
	(L. Itti \& C. Koch, 2000), in terms of both individual fixation locations
	and fixation sequences (scanpaths). The saliency model is a significantly
	better predictor of fixation location than random models that take
	into account bias toward central fixations, and this is the case
	at both encoding and recognition. However, similarity between scanpaths
	made at multiple viewings of the same stimulus suggests that repetitive
	scanpaths also contribute to where people look. Top-down recapitulation
	of scanpaths is a key prediction of scanpath theory (D. Noton \& L.
	Stark, 1971), but it might also be explained by bottom-up guidance.
	The present data suggest that saliency cannot account for scanpaths
	and that incorporating these sequences could improve model predictions.}
}
