Effects of stimulus content and duration on talker identification. Bricker, P. D. and Pruzansky, S. The Journal of the Acoustical Society of America, 40(6):1441-1449, 1966.
Sixteen listeners attempted to identify the talker when listening to speech samples of varying duration and content. The samples, recorded by 10 different talkers, were of five types: excerpted vowels, excerpted consonant-vowel (CV) sequences, monosyllabic words, disyllabic nonsense words, and sentences. Identification accuracy improved directly with the number of phonemes in the sample even when duration was controlled. Stimulus-response matrices differed substantially between the two vowels ([a] and [i]) used in the vowel and CV samples: relative identifiability of the talkers, response preference, and error patterns were all found to depend on vowel type. Confusion matrices for a given vowel exhibit definite asymmetries. In a limited additional study, subsets of listeners made identifying responses with the tapes reversed; performance deteriorated on even the briefest excerpts. The results pose some difficulties for a model of talker-identification behavior based on attributes of voice quality.
@article{bricker_effects_1996,
	Author = {Bricker, Peter D. and Pruzansky, Sandra},
	Date = {1966},
	Date-Modified = {2017-04-19 08:04:06 +0000},
	Doi = {10.1121/1.1910246},
	Journal = {The Journal of the Acoustical Society of America},
	Keywords = {forensic, phonetics, speech perception},
	Number = {6},
	Pages = {1441--1449},
	Title = {Effects of stimulus content and duration on talker identification},
	Volume = {40},
	Abstract = {Sixteen listeners attempted to identify the talker when listening to speech samples of varying duration and content. The samples, recorded by 10 different talkers, were of five types: excerpted vowels, excerpted consonant-vowel (CV) sequences, monosyllabic words, disyllabic nonsense words, and sentences. Identification accuracy improved directly with the number of phonemes in the sample even when duration was controlled. Stimulus-response matrices differed substantially between the two vowels ([a] and [i]) used in the vowel and CV samples: relative identifiability of the talkers, response preference, and error patterns were all found to depend on vowel type. Confusion matrices for a given vowel exhibit definite asymmetries. In a limited additional study, subsets of listeners made identifying responses with the tapes reversed; performance deteriorated on even the briefest excerpts. The results pose some difficulties for a model of talker-identification behavior based on attributes of voice quality.},
	Bdsk-Url-1 = {http://dx.doi.org/10.1121/1.1910246}}