Audio-Visual and Visual-Only Speech and Speaker Recognition

Audio-Visual and Visual-Only Speech and Speaker Recognition. Shiell, D. J., Terry, L. H., Aleksic, P. S., & Katsaggelos, A. K. In Visual Speech Recognition, pages 1–38. IGI Global, 2009.


The information embedded in the visual dynamics of speech has the potential to improve the performance of speech and speaker recognition systems. The information carried in the visual speech signal complements the information in the acoustic speech signal, which is particularly beneficial in adverse acoustic environments. Non-invasive methods using low-cost sensors can be used to obtain acoustic and visual biometric signals, such as a person's voice and lip movement, with little user cooperation. These types of unobtrusive biometric systems are warranted to promote widespread adoption of biometric technology in today's society. In this chapter, the authors describe the main components and theory of audio-visual and visual-only speech and speaker recognition systems. Audio-visual corpora are described and a number of speech and speaker recognition systems are reviewed. Finally, various open issues in system design and implementation are discussed, and future research and development directions in this area are presented. © 2009, IGI Global.

@incollection{Derek2009,
abstract = {The information embedded in the visual dynamics of speech has the potential to improve the performance of speech and speaker recognition systems. The information carried in the visual speech signal complements the information in the acoustic speech signal, which is particularly beneficial in adverse acoustic environments. Non-invasive methods using low-cost sensors can be used to obtain acoustic and visual biometric signals, such as a person's voice and lip movement, with little user cooperation. These types of unobtrusive biometric systems are warranted to promote widespread adoption of biometric technology in today's society. In this chapter, the authors describe the main components and theory of audio-visual and visual-only speech and speaker recognition systems. Audio-visual corpora are described and a number of speech and speaker recognition systems are reviewed. Finally, various open issues in system design and implementation are discussed, and future research and development directions in this area are presented. {\textcopyright} 2009, IGI Global.},
author = {Shiell, Derek J. and Terry, Louis H. and Aleksic, Petar S. and Katsaggelos, Aggelos K.},
booktitle = {Visual Speech Recognition},
doi = {10.4018/978-1-60566-186-5.ch001},
isbn = {9781605661865},
pages = {1--38},
publisher = {IGI Global},
title = {{Audio-Visual and Visual-Only Speech and Speaker Recognition}},
url = {http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/978-1-60566-186-5.ch001},
year = {2009}
}
