Automatic speech recognition in the diagnosis of primary progressive aphasia. Fraser, K. C., Rudzicz, F., Graham, N., & Rochon, E. In Proceedings of SLPAT 2013, 4th Workshop on Speech and Language Processing for Assistive Technologies, pages 47–54, Grenoble, France, 2013.
Narrative speech can provide a valuable source of information about an individual's linguistic abilities across lexical, syntactic, and pragmatic levels. However, analysis of narrative speech is typically done by hand, and is therefore extremely time-consuming. Use of automatic speech recognition (ASR) software could make this type of analysis more efficient and widely available. In this paper, we present the results of an initial attempt to use ASR technology to generate transcripts of spoken narratives from participants with semantic dementia (SD), progressive nonfluent aphasia (PNFA), and healthy controls. We extract text features from the transcripts and use these features, alone and in combination with acoustic features from the speech signals, to classify transcripts as patient versus control, and SD versus PNFA. Additionally, we generate artificially noisy transcripts by applying insertions, substitutions, and deletions to manually-transcribed data, allowing experiments to be conducted across a wider range of noise levels than are produced by a tuned ASR system. We find that reasonably good classification accuracies can be achieved by selecting appropriate features from the noisy transcripts. We also find that the choice of using ASR data or manually transcribed data as the training set can have a strong effect on the accuracy of the classifiers.
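The noise-injection step described in the abstract — corrupting manual transcripts with random insertions, substitutions, and deletions to simulate ASR output at a chosen error rate — could be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the function name, the even three-way split among error types, and the sample vocabulary are all assumptions.

```python
import random

def add_asr_noise(tokens, noise_level, vocab, rng=None):
    """Corrupt a word-level transcript with random insertions,
    substitutions, and deletions (hypothetical sketch: each error
    type gets an equal share of the overall noise_level)."""
    rng = rng or random.Random(0)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < noise_level / 3:
            continue                         # deletion: drop the token
        elif r < 2 * noise_level / 3:
            noisy.append(rng.choice(vocab))  # substitution: random word
        elif r < noise_level:
            noisy.append(tok)                # insertion: keep the token...
            noisy.append(rng.choice(vocab))  # ...and add a spurious word
        else:
            noisy.append(tok)                # token survives unchanged
    return noisy

# Example: 30% simulated noise on a short narrative fragment
vocab = ["the", "boy", "cookie", "jar", "falling", "stool"]
clean = "the boy is reaching for the cookie jar".split()
print(add_asr_noise(clean, 0.3, vocab))
```

Sweeping `noise_level` from 0 upward yields transcripts noisier than any tuned recognizer would produce, which is the point of the experiment: it lets classification accuracy be measured across a wider range of noise conditions than real ASR output allows.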
@InProceedings{	  fraser2013a,
  author	= {Kathleen C. Fraser and Frank Rudzicz and Naida Graham and
		  Elizabeth Rochon},
  title		= {Automatic speech recognition in the diagnosis of primary
		  progressive aphasia},
  address	= {Grenoble, France},
  booktitle	= {Proceedings of SLPAT 2013, 4th Workshop on Speech and
		  Language Processing for Assistive Technologies},
  pages		= {47--54},
  year		= {2013},
  download	= {http://ftp.cs.toronto.edu/pub/gh/Fraser-etal-SLPAT-2013.pdf},
  abstract	= {Narrative speech can provide a valuable source of
		  information about an individual's linguistic abilities
		  across lexical, syntactic, and pragmatic levels. However,
		  analysis of narrative speech is typically done by hand, and
		  is therefore extremely time-consuming. Use of automatic
		  speech recognition (ASR) software could make this type of
		  analysis more efficient and widely available. In this
		  paper, we present the results of an initial attempt to use
		  ASR technology to generate transcripts of spoken narratives
		  from participants with semantic dementia (SD), progressive
		  nonfluent aphasia (PNFA), and healthy controls. We extract
		  text features from the transcripts and use these features,
		  alone and in combination with acoustic features from the
		  speech signals, to classify transcripts as patient versus
		  control, and SD versus PNFA. Additionally, we generate
		  artificially noisy transcripts by applying insertions,
		  substitutions, and deletions to manually-transcribed data,
		  allowing experiments to be conducted across a wider range
		  of noise levels than are produced by a tuned ASR system.
		  We find that reasonably good classification accuracies can
		  be achieved by selecting appropriate features from the
		  noisy transcripts. We also find that the choice of using
		  ASR data or manually transcribed data as the training set
		  can have a strong effect on the accuracy of the
		  classifiers.}
}
