Development of a Classification-based Eye Gaze estimation technique using an Integrated Laptop Camera: two models are better than one. August 2023. Zenodo.
While there are many purely software-based solutions available for live head pose tracking, the same is not true for gaze estimation, which is usually done using specialised hardware: typically, eye-trackers that use infrared light sources and a special camera. The challenge when using vision-based Machine Learning methods to estimate gaze from an image of a user is that the features of a user's eyes that vary with gaze position are small and difficult to track. This paper describes a classification-based Convolutional Neural Network (CNN) system that can, from images of a user's face, estimate in real time where on a screen they are looking. Labelled images of an individual looking at specified regions of a screen were collected, and these images were used to train a Deep Learning model. The final model was found to be accurate enough to correctly identify which section of the screen (on a 3x3 grid) the user is looking at over 99% of the time. The current system imposes strict constraints on head orientation, a limitation of the approach. The integration of head pose estimation and methods of improving resolution are discussed. The approach uses two classifiers, one for horizontal gaze (3 regions) and one for vertical gaze (3 regions), whose outputs are combined to estimate gaze location. This approach was found to be more effective than a single classifier over nine separate regions. Using both eyes and excluding the nose region and other facial features from the input images also improved performance.
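The two-classifier scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function name and the use of plain probability lists are assumptions for illustration. It simply takes the output probabilities of a 3-way horizontal classifier and a 3-way vertical classifier and maps them to one of the nine cells of the 3x3 screen grid:

```python
def combine_gaze_classifiers(p_horizontal, p_vertical):
    """Combine two 3-way classifier outputs into one 3x3 screen cell.

    p_horizontal: probabilities over (left, centre, right)
    p_vertical:   probabilities over (top, centre, bottom)
    Returns (row, col) indexing into the 3x3 screen grid.
    """
    col = p_horizontal.index(max(p_horizontal))
    row = p_vertical.index(max(p_vertical))
    return row, col

# Example: horizontal classifier favours "right", vertical favours "top"
cell = combine_gaze_classifiers([0.1, 0.2, 0.7], [0.8, 0.1, 0.1])
# cell == (0, 2): the top-right region of the screen
```

One motivation for this factorisation is that each classifier only has to separate 3 classes rather than 9, while their combination still covers all nine regions.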
@inproceedings{cribbin_2023_8217852,
  title        = {Development of a Classification-based Eye Gaze 
                   estimation technique using an Integrated Laptop
                   Camera: two models are better than one},
  year         = {2023},
  publisher    = {Zenodo},
  month        = {August},
  abstract     = {While there are many purely software-based solutions
                   available for live head pose tracking, the same is not
                   true for gaze estimation, which is usually done using
                   specialised hardware: typically, eye-trackers that use
                   infrared light sources and a special camera. The
                   challenge when using vision-based Machine Learning
                   methods to estimate gaze from an image of a user is
                   that the features of a user's eyes that vary with gaze
                   position are small and difficult to track. This paper
                   describes a classification-based Convolutional Neural
                   Network (CNN) system that can, from images of a user's
                   face, estimate in real time where on a screen they are
                   looking. Labelled images of an individual looking at
                   specified regions of a screen were collected, and these
                   images were used to train a Deep Learning model. The
                   final model was found to be accurate enough to
                   correctly identify which section of the screen (on a
                   3x3 grid) the user is looking at over 99% of the time.
                   The current system imposes strict constraints on head
                   orientation, a limitation of the approach. The
                   integration of head pose estimation and methods of
                   improving resolution are discussed. The approach uses
                   two classifiers, one for horizontal gaze (3 regions)
                   and one for vertical gaze (3 regions), whose outputs
                   are combined to estimate gaze location. This approach
                   was found to be more effective than a single classifier
                   over nine separate regions. Using both eyes and
                   excluding the nose region and other facial features
                   from the input images also improved performance.},
  doi          = {10.5281/zenodo.8217852},
  url          = {https://doi.org/10.5281/zenodo.8217852}
}
