A Comparison Between Spiking and Differentiable Recurrent Neural Networks on Spoken Digit Recognition

A Comparison Between Spiking and Differentiable Recurrent Neural Networks on Spoken Digit Recognition. Graves, A., Beringer, N., & Schmidhuber, J. In The 23rd IASTED International Conference on Modelling, Identification, and Control, Grindelwald, 2004.

In this paper we demonstrate that Long Short-Term Memory (LSTM) is a differentiable recurrent neural net (RNN) capable of robustly categorizing timewarped speech data. We measure its performance on a spoken digit identification task, where the data was spike-encoded in such a way that classifying the utterances became a difficult challenge in non-linear timewarping. We find that LSTM gives greatly superior results to an SNN found in the literature, and conclude that the architecture has a place in domains that require the learning of large timewarped datasets, such as automatic speech recognition.

@INPROCEEDINGS{graves+beringer+schmidhuber:2004,
  AUTHOR = {A. Graves and N. Beringer and J. Schmidhuber},
  TITLE = {A Comparison Between Spiking and Differentiable Recurrent Neural Networks on Spoken Digit Recognition},
  BOOKTITLE = {The 23rd IASTED International Conference on Modelling, Identification, and Control},
  ADDRESS = {Grindelwald},
  YEAR = {2004},
  SOURCE = {OwnPublication},
  ABSTRACT = {In this paper we demonstrate that Long Short-Term Memory (LSTM) is a differentiable recurrent neural net (RNN)
		capable of robustly categorizing timewarped speech data. We measure its performance on a spoken digit 
		identification task, where the data was spike-encoded in such a way that classifying the utterances became a 
		difficult challenge in non-linear timewarping. We find that LSTM gives greatly superior results to an SNN found in the
		literature, and conclude that the architecture has a place in domains that require the learning of large timewarped datasets, 
		such as automatic speech recognition.}
}
