Vapnik-Chervonenkis dimension of recurrent neural networks. Koiran, P. & Sontag, E. Discrete Appl. Math., 86(1):63–79, Elsevier Science Publishers B. V., Amsterdam, The Netherlands, 1998.
This paper provides lower and upper bounds for the VC dimension of recurrent networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. Ignoring multiplicative constants, the main results say roughly the following: 1. For architectures whose activation is any fixed nonlinear polynomial, the VC dimension is proportional to wk. 2. For architectures whose activation is any fixed piecewise polynomial, the VC dimension is between wk and w²k. 3. For architectures with threshold activations, the VC dimension is between w log(k/w) and the smaller of wk log(wk) and w² + w log(wk). 4. For the standard sigmoid tanh(x), the VC dimension is between wk and w⁴k².
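
For quick reference, the four bounds above can be restated in standard asymptotic notation; this is a direct paraphrase of the abstract, not additional results from the paper:

\begin{aligned}
\text{polynomial activation:} &\quad \mathrm{VCdim} = \Theta(wk),\\
\text{piecewise-polynomial activation:} &\quad \Omega(wk) \le \mathrm{VCdim} \le O(w^2 k),\\
\text{threshold activation:} &\quad \Omega(w \log(k/w)) \le \mathrm{VCdim} \le O\bigl(\min\{\, wk\log(wk),\ w^2 + w\log(wk) \,\}\bigr),\\
\text{sigmoid } \tanh(x)\text{:} &\quad \Omega(wk) \le \mathrm{VCdim} \le O(w^4 k^2).
\end{aligned}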
@ARTICLE{MR1634875,
   AUTHOR       = {P. Koiran and E.D. Sontag},
   JOURNAL      = {Discrete Appl. Math.},
   TITLE        = {Vapnik-Chervonenkis dimension of recurrent neural 
      networks},
   YEAR         = {1998},
   NUMBER       = {1},
   PAGES        = {63--79},
   VOLUME       = {86},
   ADDRESS      = {Amsterdam, The Netherlands},
   KEYWORDS     = {neural networks, recurrent neural networks},
   PUBLISHER    = {Elsevier Science Publishers B. V.},
   PDF          = {../../FTPDIR/koiran-dam.pdf},
   ABSTRACT     = { This paper provides lower and upper bounds for the VC 
      dimension of recurrent networks. Several types of activation 
      functions are discussed, including threshold, polynomial, 
      piecewise-polynomial and sigmoidal functions. The bounds depend on 
      two independent parameters: the number $w$ of weights in the network, 
      and the length $k$ of the input sequence. Ignoring multiplicative 
      constants, the main results say roughly the following: 1. For 
      architectures whose activation is any fixed nonlinear polynomial, the 
      VC dimension is proportional to $wk$. 2. For architectures whose 
      activation is any fixed piecewise polynomial, the VC dimension is 
      between $wk$ and $w^2 k$. 3. For architectures with threshold 
      activations, the VC dimension is between $w\log(k/w)$ and the smaller 
      of $wk\log(wk)$ and $w^2 + w\log(wk)$. 4. For the standard sigmoid 
      $\tanh(x)$, the VC dimension is between $wk$ and $w^4 k^2$. },
   DOI          = {10.1016/S0166-218X(98)00014-6}
}
