On the automatic recognition of continuous speech: Implications from a spectrogram-reading experiment. Klatt, D. H. and Stevens, K. N. IEEE Transactions on Audio and Electroacoustics, 21(3):210-217, 1973.
An experiment was performed in which the authors attempted to recognize a set of unknown sentences by visual examination of spectrograms and machine-aided lexical searching. Nineteen sentences representing data from five talkers were analyzed. An initial partial transcription in terms of phonetic features was performed. The transcription contained many errors and omissions: 10 percent of the segments were omitted, 17 percent were incorrectly transcribed, and an additional 40 percent were transcribed only partially in terms of phonetic features. The transcription was used by the experimenters to initiate computerized scans of a 200-word lexicon. A majority of the search responses did not contain the correct word. However, following extended interactions with the computer, a word-recognition rate of 96 percent was achieved by each investigator for the sentence material. Implications for automatic speech recognition are discussed. In particular, the differences between the phonetic characteristics of isolated words and of the same words when they appear in sentences are emphasized.
@article{klatt_automatic_1973,
	Author = {Klatt, Dennis H and Stevens, Kenneth N},
	Date = {1973},
	Date-Modified = {2017-04-19 08:04:07 +0000},
	Doi = {10.1109/TAU.1973.1162453},
	Journal = {IEEE Transactions on Audio and Electroacoustics},
	Keywords = {acoustic phonetics, phonetics, speech technology},
	Number = {3},
	Pages = {210--217},
	Title = {On the automatic recognition of continuous speech: Implications from a spectrogram-reading experiment},
	Volume = {21},
	Abstract = {An experiment was performed in which the authors attempted to recognize a set of unknown sentences by visual examination of spectrograms and machine-aided lexical searching. Nineteen sentences representing data from five talkers were analyzed. An initial partial transcription in terms of phonetic features was performed. The transcription contained many errors and omissions: 10 percent of the segments were omitted, 17 percent were incorrectly transcribed, and an additional 40 percent were transcribed only partially in terms of phonetic features. The transcription was used by the experimenters to initiate computerized scans of a 200-word lexicon. A majority of the search responses did not contain the correct word. However, following extended interactions with the computer, a word-recognition rate of 96 percent was achieved by each investigator for the sentence material. Implications for automatic speech recognition are discussed. In particular, the differences between the phonetic characteristics of isolated words and of the same words when they appear in sentences are emphasized.},
	Bdsk-Url-1 = {http://dx.doi.org/10.1109/TAU.1973.1162453}}