Real-Time High-Performance Attention Focusing in Outdoors Color Video Streams

Real-Time High-Performance Attention Focusing in Outdoors Color Video Streams. Itti, L. In Rogowitz, B. & Pappas, T. N., editors, Proc. SPIE Human Vision and Electronic Imaging VII (HVEI'02), San Jose, CA, pages 235-243, Bellingham, WA, Jan, 2002. SPIE Press.
When confronted with cluttered natural environments, animals still perform orders of magnitude better than artificial vision systems in tasks such as orienting, target detection, navigation and scene understanding. The recent widespread availability of significant computational resources, however, in particular through the deployment of so-called "Beowulf" clusters of low-cost personal computers, leaves us little excuse for the enormous gap still separating biological from machine vision systems. We describe a neuromorphic model of how our visual attention is attracted towards conspicuous locations in a visual scene. It replicates processing in posterior parietal cortex and other brain areas along the dorsal visual stream in the primate brain. The model includes a bottom-up (image-based) computation of low-level color, intensity, orientation and motion features, as well as a non-linear spatial competition which enhances salient locations in each of these feature channels. All feature channels feed into a unique scalar "saliency map" which controls where attention is focused next. Because it includes a detailed low-level vision front-end, the model has been applied not only to laboratory stimuli, but also to a wide variety of natural scenes. In addition to predicting a wealth of psychophysical experiments, the model demonstrated remarkable performance at detecting salient objects in outdoors imagery, sometimes exceeding human performance, despite wide variations in imaging conditions, targets to be detected, and environments. The present paper focuses on a recently completed parallelization of the model, which runs at 30 frames/s on a 16-CPU Beowulf cluster, and on the enhancement of this real-time model to include motion cues in addition to the previously studied color, intensity and orientation cues. The parallel model architecture and its deployment onto Linux Beowulf clusters are described, as well as several examples of applications to real-time outdoors color video streams. The model proves very robust at detecting salient targets from live video streams, despite large possible variations in illumination, rapid camera jitter, clutter, or omnipresent optical flow (e.g., when used on a moving vehicle). The success of this approach suggests that the neuromorphic architecture described may represent a robust and efficient real-time machine vision front-end, which can be used in conjunction with more detailed localized object recognition and identification algorithms to be applied at the selected salient locations.
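
The pipeline summarized in the abstract (per-channel feature maps, a spatial competition within each channel, and a sum into one scalar saliency map whose maximum gives the next location to attend) can be illustrated with a minimal single-machine sketch. This is not the paper's parallel implementation. The function names are hypothetical, the orientation channel is omitted, motion is approximated by simple frame differencing, and the non-linear spatial competition is replaced by a crude global weighting, all of which are assumptions made for illustration only.

```python
import numpy as np


def normalize_map(feature_map):
    """Crude stand-in for the non-linear spatial competition (an assumption).

    The map is rescaled to [0, 1] and then weighted by (max - mean)^2, so a
    map with one strong peak is promoted over a map with many similar peaks.
    """
    fmap = np.asarray(feature_map, dtype=np.float64)
    span = fmap.max() - fmap.min()
    if span == 0:
        # A perfectly uniform map carries no conspicuous location.
        return np.zeros_like(fmap)
    fmap = (fmap - fmap.min()) / span
    return fmap * (fmap.max() - fmap.mean()) ** 2


def saliency_map(frame, previous_frame):
    """Combine simplified intensity, color and motion channels.

    Both arguments are H x W x 3 RGB arrays with float values in [0, 1].
    """
    r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
    intensity = (r + g + b) / 3.0
    previous_intensity = previous_frame.mean(axis=2)

    features = [
        np.abs(intensity - intensity.mean()),        # intensity contrast
        np.abs(r - g) + np.abs(b - (r + g) / 2.0),   # rough color opponency
        np.abs(intensity - previous_intensity),      # frame-difference motion
    ]
    # Every channel passes through the competition step, then all channels
    # are summed into a single scalar map.
    return sum(normalize_map(f) for f in features)


def most_salient_location(saliency):
    """Return the (row, column) of the saliency maximum as plain ints."""
    index = np.unravel_index(np.argmax(saliency), saliency.shape)
    return tuple(int(i) for i in index)
```

Calling `saliency_map(frame, previous_frame)` on two consecutive frames and passing the result to `most_salient_location` yields the pixel that a downstream recognition stage would examine first. A fuller model would compute each feature at several spatial scales and add an orientation channel.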

@inproceedings{ Itti02hvei,
  author = { L. Itti },
  title = { Real-Time High-Performance Attention Focusing in Outdoors
Color Video Streams},
  year = {2002},
  month = jan,
  pages = {235--243},
  abstract = { When confronted with cluttered natural environments,
animals still perform orders of magnitude better than artificial
vision systems in tasks such as orienting, target detection,
navigation and scene understanding. The recent widespread availability
of significant computational resources, however, in particular through
the deployment of so-called "Beowulf" clusters of low-cost personal
computers, leaves us little excuse for the enormous gap still
separating biological from machine vision systems.  We describe a
neuromorphic model of how our visual attention is attracted towards
conspicuous locations in a visual scene.  It replicates processing in
posterior parietal cortex and other brain areas along the dorsal
visual stream in the primate brain. The model includes a bottom-up
(image-based) computation of low-level color, intensity, orientation
and motion features, as well as a non-linear spatial competition which
enhances salient locations in each of these feature channels.  All
feature channels feed into a unique scalar "saliency map" which
controls where attention is focused next. Because it includes a
detailed low-level vision front-end, the model has been applied not
only to laboratory stimuli, but also to a wide variety of natural
scenes. In addition to predicting a wealth of psychophysical
experiments, the model demonstrated remarkable performance at
detecting salient objects in outdoors imagery --- sometimes exceeding
human performance --- despite wide variations in imaging conditions,
targets to be detected, and environments.  The present paper focuses
on a recently completed parallelization of the model, which runs at 30
frames/s on a 16-CPU Beowulf cluster, and on the enhancement of this
real-time model to include motion cues in addition to the previously
studied color, intensity and orientation cues. The parallel model
architecture and its deployment onto Linux Beowulf clusters are
described, as well as several examples of applications to real-time
outdoors color video streams. The model proves very robust at
detecting salient targets from live video streams, despite large
possible variations in illumination, rapid camera jitter, clutter, or
omnipresent optical flow (e.g., when used on a moving vehicle).  The
success of this approach suggests that the neuromorphic architecture
described may represent a robust and efficient real-time machine
vision front-end, which can be used in conjunction with more detailed
localized object recognition and identification algorithms to be
applied at the selected salient locations.},
  booktitle = { Proc. SPIE Human Vision and Electronic Imaging VII
(HVEI'02), San Jose, CA },
  editor = {B. Rogowitz and T. N. Pappas},
  publisher = {SPIE Press},
  address = {Bellingham, WA},
  type = { mod;bu;cv },
  file = { http://iLab.usc.edu/publications/doc/Itti02hvei.pdf },
  review = {abs/conf}
}
