Attention-based mechanism for head-body fusion gaze estimation in dynamic scenes. Zhang, W., Xiong, J., Dong, X., Wang, Q., & Quan, G. In 2024 6th International Conference on Internet of Things, Automation and Artificial Intelligence (IoTAAI 2024), pages 484-489, 2024. IEEE. doi: 10.1109/IoTAAI62601.2024.10692563.
Abstract: An individual's gaze can be understood as information conveyed in the interaction between the individual and their surroundings. Estimating the gaze direction of pedestrians in real-world scenarios is a significant challenge for computers: blurry images or occlusion of the subject's eyes make the estimation particularly difficult, and the majority of gaze estimation methods depend on the availability of high-definition facial or ocular images. To address this issue, we propose a novel gaze estimation method that integrates head and torso features, with the objective of achieving gaze estimation under real camera conditions. To address the challenge of accurately extracting gaze features in the presence of background noise, we propose a CBAM-based feature extractor. Furthermore, we propose a linear-regression method for multi-channel feature fusion using a Bi-LSTM to resolve ambiguities caused by gaze changes over time, enabling estimation of the user's gaze direction. Our method predicts the user's gaze direction, head deflection direction, body deflection direction, angular error, and confidence by analysing seven consecutive frames.
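The paper does not come with a reference implementation here; purely as an illustrative sketch of the pipeline the abstract describes (CBAM attention on per-frame head and body features, Bi-LSTM fusion over seven consecutive frames, linear regression to a gaze direction), the following PyTorch module is a minimal, hypothetical reconstruction. All names (HeadBodyGazeNet, the branch sizes, the (yaw, pitch) output) are assumptions, not the authors' actual architecture.

import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention
    followed by spatial attention (Woo et al., 2018)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: shared MLP over avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: 7x7 conv over channel-wise avg/max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)  # reweight channels
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))  # reweight spatial positions

class HeadBodyGazeNet(nn.Module):
    """Hypothetical head-body fusion network: per-frame CNN+CBAM features
    for head and body crops are concatenated and fused over time by a
    Bi-LSTM, with a linear regression head for the gaze direction."""
    def __init__(self, feat_dim: int = 64, hidden: int = 128):
        super().__init__()
        def branch():
            # Tiny stand-in backbone; a real system would use a deeper CNN.
            return nn.Sequential(
                nn.Conv2d(3, feat_dim, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
                CBAM(feat_dim),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
            )
        self.head_branch = branch()
        self.body_branch = branch()
        self.bilstm = nn.LSTM(2 * feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.regress = nn.Linear(2 * hidden, 2)  # gaze as (yaw, pitch)

    def forward(self, head: torch.Tensor, body: torch.Tensor) -> torch.Tensor:
        # head, body: (B, T, 3, H, W) with T = 7 consecutive frames.
        b, t = head.shape[:2]
        h = self.head_branch(head.flatten(0, 1)).view(b, t, -1)
        bd = self.body_branch(body.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.bilstm(torch.cat([h, bd], dim=-1))
        return self.regress(seq[:, -1])  # predict from the final time step

if __name__ == "__main__":
    net = HeadBodyGazeNet()
    head = torch.randn(2, 7, 3, 64, 64)
    body = torch.randn(2, 7, 3, 64, 64)
    print(net(head, body).shape)  # torch.Size([2, 2])

Note that this sketch only fixes the data flow; the paper additionally reports head and body deflection directions, angular error, and a confidence output, which would require extra regression heads alongside the gaze one.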
@inproceedings{zhang2024attention,
title = {Attention-based mechanism for head-body fusion gaze estimation in dynamic scenes},
year = {2024},
keywords = {Bi-LSTM,CBAM,feature fusion,gaze estimation},
pages = {484-489},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {An individual's gaze can be understood as information conveyed in the interaction between the individual and their surroundings. Estimating the gaze direction of pedestrians in real-world scenarios is a significant challenge for computers: blurry images or occlusion of the subject's eyes make the estimation particularly difficult, and the majority of gaze estimation methods depend on the availability of high-definition facial or ocular images. To address this issue, we propose a novel gaze estimation method that integrates head and torso features, with the objective of achieving gaze estimation under real camera conditions. To address the challenge of accurately extracting gaze features in the presence of background noise, we propose a CBAM-based feature extractor. Furthermore, we propose a linear-regression method for multi-channel feature fusion using a Bi-LSTM to resolve ambiguities caused by gaze changes over time, enabling estimation of the user's gaze direction. Our method predicts the user's gaze direction, head deflection direction, body deflection direction, angular error, and confidence by analysing seven consecutive frames.},
author = {Zhang, Wulue and Xiong, Jianbin and Dong, Xiangjun and Wang, Qi and Quan, Guoyuan},
doi = {10.1109/IoTAAI62601.2024.10692563},
booktitle = {2024 6th International Conference on Internet of Things, Automation and Artificial Intelligence, IoTAAI 2024}
}
{"_id":"SPEQapoy6Jkr826qx","bibbaseid":"zhang-xiong-dong-wang-quan-attentionbasedmechanismforheadbodyfusiongazeestimationindynamicscenes-2024","author_short":["Zhang, W.","Xiong, J.","Dong, X.","Wang, Q.","Quan, G."],"bibdata":{"title":"Attention-based mechanism for head-body fusion gaze estimation in dynamic scenes","type":"inproceedings","year":"2024","keywords":"Bi-LSTM,CBAM,feature fusion,gaze estimation","pages":"484-489","publisher":"Institute of Electrical and Electronics Engineers Inc.","id":"5ceb6585-5ca2-39ff-8649-c879a61d2e8d","created":"2024-11-06T13:03:59.513Z","file_attached":"true","profile_id":"f1f70cad-e32d-3de2-a3c0-be1736cb88be","group_id":"5ec9cc91-a5d6-3de5-82f3-3ef3d98a89c1","last_modified":"2024-11-07T07:33:10.444Z","read":"true","starred":false,"authored":false,"confirmed":false,"hidden":false,"folder_uuids":"4cda9297-f98e-4246-88d6-ffeeade205c3","private_publication":false,"abstract":"The gaze of an individual can be understood as the information conveyed in the interaction between the individual and the surroundings. Computers face significant challenges in estimating the gaze direction of pedestrians in real-world scenarios. The presence of blurry images or occlusion of the tester's eyes makes the estimation particularly challenging. The majority of gaze estimation methodologies are contingent upon the availability of high-definition facial or ocular images. To address this issue, we propose a noval gaze estimation method that integrates head and torso features with the objective of achieving gaze estimation in authentic camera settings. In order to address the challenge of accurately extracting gaze features in the presence of background noise, we propose a CBAM-based feature extractor. Furthermore, we propose a linear regression method for multi-channel feature fusion using Bi-LSTM to address ambiguities due to changes in gaze time, enabling the estimation of the user's gaze direction. Our method accurately predicts the user's gaze direction, head deflection direction, body deflection direction, angular error and confidence by analysing seven consecutive frames of images.","bibtype":"inproceedings","author":"Zhang, Wulue and Xiong, Jianbin and Dong, Xiangjun and Wang, Qi and Quan, Guoyuan","doi":"10.1109/IoTAAI62601.2024.10692563","booktitle":"2024 6th International Conference on Internet of Things, Automation and Artificial Intelligence, IoTAAI 2024","bibtex":"@inproceedings{\n title = {Attention-based mechanism for head-body fusion gaze estimation in dynamic scenes},\n type = {inproceedings},\n year = {2024},\n keywords = {Bi-LSTM,CBAM,feature fusion,gaze estimation},\n pages = {484-489},\n publisher = {Institute of Electrical and Electronics Engineers Inc.},\n id = {5ceb6585-5ca2-39ff-8649-c879a61d2e8d},\n created = {2024-11-06T13:03:59.513Z},\n file_attached = {true},\n profile_id = {f1f70cad-e32d-3de2-a3c0-be1736cb88be},\n group_id = {5ec9cc91-a5d6-3de5-82f3-3ef3d98a89c1},\n last_modified = {2024-11-07T07:33:10.444Z},\n read = {true},\n starred = {false},\n authored = {false},\n confirmed = {false},\n hidden = {false},\n folder_uuids = {4cda9297-f98e-4246-88d6-ffeeade205c3},\n private_publication = {false},\n abstract = {The gaze of an individual can be understood as the information conveyed in the interaction between the individual and the surroundings. Computers face significant challenges in estimating the gaze direction of pedestrians in real-world scenarios. 
The presence of blurry images or occlusion of the tester's eyes makes the estimation particularly challenging. The majority of gaze estimation methodologies are contingent upon the availability of high-definition facial or ocular images. To address this issue, we propose a noval gaze estimation method that integrates head and torso features with the objective of achieving gaze estimation in authentic camera settings. In order to address the challenge of accurately extracting gaze features in the presence of background noise, we propose a CBAM-based feature extractor. Furthermore, we propose a linear regression method for multi-channel feature fusion using Bi-LSTM to address ambiguities due to changes in gaze time, enabling the estimation of the user's gaze direction. Our method accurately predicts the user's gaze direction, head deflection direction, body deflection direction, angular error and confidence by analysing seven consecutive frames of images.},\n bibtype = {inproceedings},\n author = {Zhang, Wulue and Xiong, Jianbin and Dong, Xiangjun and Wang, Qi and Quan, Guoyuan},\n doi = {10.1109/IoTAAI62601.2024.10692563},\n booktitle = {2024 6th International Conference on Internet of Things, Automation and Artificial Intelligence, IoTAAI 2024}\n}","author_short":["Zhang, W.","Xiong, J.","Dong, X.","Wang, Q.","Quan, G."],"urls":{"Paper":"https://bibbase.org/service/mendeley/bfbbf840-4c42-3914-a463-19024f50b30c/file/6d765caa-f026-5b69-6366-b58173ef8d26/Attention_based_mechanism_for_head_body_fusion_gaze_estimation_in_dynamic_scenes.pdf.pdf"},"biburl":"https://bibbase.org/service/mendeley/bfbbf840-4c42-3914-a463-19024f50b30c","bibbaseid":"zhang-xiong-dong-wang-quan-attentionbasedmechanismforheadbodyfusiongazeestimationindynamicscenes-2024","role":"author","keyword":["Bi-LSTM","CBAM","feature fusion","gaze estimation"],"metadata":{"authorlinks":{}},"downloads":0},"bibtype":"inproceedings","biburl":"https://bibbase.org/service/mendeley/bfbbf840-4c42-3914-a463-19024f50b30c","dataSources":["2252seNhipfTmjEBQ"],"keywords":["bi-lstm","cbam","feature fusion","gaze estimation"],"search_terms":["attention","based","mechanism","head","body","fusion","gaze","estimation","dynamic","scenes","zhang","xiong","dong","wang","quan"],"title":"Attention-based mechanism for head-body fusion gaze estimation in dynamic scenes","year":2024}