A hierarchical Dirichlet language model. MacKay, D. J. & Peto, L. B. Natural Language Engineering, 1(3):289–307, September, 1995.
We discuss a hierarchical probabilistic model whose predictions are similar to those of the popular language modelling procedure known as `smoothing'. A number of interesting differences from smoothing emerge. The insights gained from a probabilistic view of this problem point towards new directions for language modelling. The ideas of this paper are also applicable to other problems, such as the modelling of triphones in speech and DNA and protein sequences in molecular biology. The new algorithm is compared with smoothing on a two-million word corpus. The methods prove to be about equally accurate, with the hierarchical model using fewer computational resources.
@Article{	  peto1,
  author	= {David J.C. MacKay and Linda Bauman Peto},
  title		= {A hierarchical Dirichlet language model},
  journal	= {Natural Language Engineering},
  volume	= {1},
  number	= {3},
  month		= {September},
  year		= {1995},
  pages		= {289--307},
  abstract	= {We discuss a hierarchical probabilistic model whose
		  predictions are similar to those of the popular language
		  modelling procedure known as `smoothing'. A number of
		  interesting differences from smoothing emerge. The insights
		  gained from a probabilistic view of this problem point
		  towards new directions for language modelling. The ideas of
		  this paper are also applicable to other problems, such as
		  the modelling of triphones in speech and DNA and protein
		  sequences in molecular biology. The new algorithm is
		  compared with smoothing on a two-million word corpus. The
		  methods prove to be about equally accurate, with the
		  hierarchical model using fewer computational resources.}
}
