A method for speaker verification. Doddington, G. R. The Journal of the Acoustical Society of America, 49(1A):139, 1971.
The speaker‐verification problem is defined and contrasted with the speaker‐identification problem. A speaker‐verification experiment is performed using eight known speakers and 32 impostors. Formant frequencies, voicing pitch period, and speech energy---all as functions of time---are used in verification. Proper time normalization is shown to be an important factor in improving verification error performance. Nonlinear time normalization is performed by maximizing the correlation between sample and reference second‐formant profiles through a piecewise linear continuous transformation of time. Average error rates after time normalization were: for pitch, 0.05; for formants, 0.04; for energy, 0.04; and over‐all, 0.015. This over‐all error rate is four times less than that obtained using only utterance endpoint alignment.
@article{doddington_method_1971,
	Author = {Doddington, George R.},
	Date = {1971},
	Date-Modified = {2017-04-19 08:04:06 +0000},
	Doi = {10.1121/1.1975906},
	File = {Attachment:files/3103/Doddington - 1971 - A method for speaker verification.pdf:application/pdf},
	Issn = {0001-4966},
	Journal = {The Journal of the Acoustical Society of America},
	Keywords = {speaker recognition, speech technology},
	Number = {1A},
	Pages = {139},
	Title = {A method for speaker verification},
	Url = {http://link.aip.org/link/JASMAN/v49/i1A/p139/s2&Agg=doi},
	Volume = {49},
	Abstract = {The speaker‐verification problem is defined and contrasted with the speaker‐identification problem. A speaker‐verification experiment is performed using eight known speakers and 32 impostors. Formant frequencies, voicing pitch period, and speech energy---all as functions of time---are used in verification. Proper time normalization is shown to be an important factor in improving verification error performance. Nonlinear time normalization is performed by maximizing the correlation between sample and reference second‐formant profiles through a piecewise linear continuous transformation of time. Average error rates after time normalization were: for pitch, 0.05; for formants, 0.04; for energy, 0.04; and over‐all, 0.015. This over‐all error rate is four times less than that obtained using only utterance endpoint alignment.},
	Bdsk-Url-1 = {http://link.aip.org/link/JASMAN/v49/i1A/p139/s2&Agg=doi},
	Bdsk-Url-2 = {http://dx.doi.org/10.1121/1.1975906}}