Minimum Classification Error Training in Exponential Language Models. Paciorek, C. & Rosenfeld, R. In NIST/DARPA Speech Transcription Workshop, May 2000. Carnegie Mellon University.

Abstract: Minimum Classification Error (MCE) training is difficult to apply to language modeling due to inherent scarcity of training data (N-best lists). However, a whole-sentence exponential language model is particularly suitable for MCE training, because it can use a relatively small number of powerful features to capture global sentential phenomena. We review the model, discuss feature induction, find features in both the Broadcast News and Switchboard domains, and build an MCE-trained model for the latter. Our experiments show that even models with relatively few features are prone to overfitting and are sensitive to initial parameter setting, leading us to examine alternative weight optimization criteria and search algorithms.
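For context, the whole-sentence exponential language model referenced in the abstract is conventionally written as follows. This is a sketch of the standard formulation from Rosenfeld's whole-sentence maximum-entropy work; the notation is generic and not copied from this paper:

\[
P_\lambda(s) = \frac{1}{Z(\lambda)}\, P_0(s)\, \exp\!\Big( \sum_i \lambda_i f_i(s) \Big)
\]

where $P_0(s)$ is a baseline model (typically an n-gram), the $f_i(s)$ are sentence-level feature functions, the $\lambda_i$ are trainable weights, and $Z(\lambda)$ normalizes over sentences. MCE training replaces the maximum-likelihood objective with a smoothed error count: given a misclassification measure $d(s)$ comparing the score of the correct hypothesis against its competitors in an N-best list, the per-sentence loss is the sigmoid

\[
\ell(d) = \frac{1}{1 + e^{-\gamma d}}
\]

whose slope $\gamma$ controls how sharply the loss approximates a 0/1 error count. These are the standard definitions from Juang and Katagiri's MCE framework; the paper's exact misclassification measure and optimization details may differ.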
@inproceedings{paciorek_minimum_2000,
title = {Minimum {Classification} {Error} {Training} in {Exponential} {Language} {Models}},
abstract = {Minimum Classification Error (MCE) training is difficult to apply to language modeling due to inherent scarcity of training data (N-best lists). However, a whole-sentence exponential language model is particularly suitable for MCE training, because it can use a relatively small number of powerful features to capture global sentential phenomena. We review the model, discuss feature induction, find features in both the Broadcast News and Switchboard domains, and build an MCE-trained model for the latter. Our experiments show that even models with relatively few features are prone to overfitting and are sensitive to initial parameter setting, leading us to examine alternative weight optimization criteria and search algorithms.},
booktitle = {{NIST}/{DARPA} {Speech} {Transcription} {Workshop}},
publisher = {Carnegie Mellon University},
author = {Paciorek, C and Rosenfeld, R},
month = may,
year = {2000},
keywords = {Statistical / Machine Learning Methods in Speech and Language Processing},
}
{"_id":"SPWP48drnr9dLtqxi","bibbaseid":"paciorek-rosenfeld-minimumclassificationerrortraininginexponentiallanguagemodels-2000","downloads":0,"creationDate":"2016-06-29T19:16:08.428Z","title":"Minimum Classification Error Training in Exponential Language Models","author_short":["Paciorek, C","Rosenfeld, R"],"year":2000,"bibtype":"inproceedings","biburl":"https://api.zotero.org/users/5636389/collections/8RG6RK86/items/top?format=bibtex&recursive=1&limit=100&key=lo1KVmBiVRveHF1eNrgQn1PM","bibdata":{"bibtype":"inproceedings","type":"inproceedings","title":"Minimum Classification Error Training in Exponential Language Models","abstract":"Minimum Classification Error (MCE) training is difficult to apply to language modeling due to inherent scarcity of training data (N-best lists). However, a whole-sentence exponential language model is particularly suitable forMCE training, because it can use a relatively small number of powerful features to capture global sentential phenomena. We review the model, discuss feature induction, find features in both the Broadcast News and Switchboard domains, and build an MCE-trained model for the latter. Our experiments show that even models with relatively few features are prone to overfitting and are sensitive to initial parameter setting, leading us to examine alternative weight optimization criteria and search algorithms.","booktitle":"NIST/DARPA Speech Transcription Workshop","publisher":"Carnegie Mellon University","author":[{"propositions":[],"lastnames":["Paciorek"],"firstnames":["C"],"suffixes":[]},{"propositions":[],"lastnames":["Rosenfeld"],"firstnames":["R"],"suffixes":[]}],"month":"May","year":"2000","keywords":"Statistical / Machine Learning Methods in Speech and Language Processing","bibtex":"@inproceedings{paciorek_minimum_2000,\n\ttitle = {Minimum {Classification} {Error} {Training} in {Exponential} {Language} {Models}},\n\tabstract = {Minimum Classification Error (MCE) training is difficult to apply to language modeling due to inherent scarcity of training data (N-best lists). However, a whole-sentence exponential language model is particularly suitable forMCE training, because it can use a relatively small number of powerful features to capture global sentential phenomena. We review the model, discuss feature induction, find features in both the Broadcast News and Switchboard domains, and build an MCE-trained model for the latter. 
Our experiments show that even models with relatively few features are prone to overfitting and are sensitive to initial parameter setting, leading us to examine alternative weight optimization criteria and search algorithms.},\n\tbooktitle = {{NIST}/{DARPA} {Speech} {Transcription} {Workshop}},\n\tpublisher = {Carnegie Mellon University},\n\tauthor = {Paciorek, C and Rosenfeld, R},\n\tmonth = may,\n\tyear = {2000},\n\tkeywords = {Statistical / Machine Learning Methods in Speech and Language Processing},\n}\n\n","author_short":["Paciorek, C","Rosenfeld, R"],"key":"paciorek_minimum_2000","id":"paciorek_minimum_2000","bibbaseid":"paciorek-rosenfeld-minimumclassificationerrortraininginexponentiallanguagemodels-2000","role":"author","urls":{},"keyword":["Statistical / Machine Learning Methods in Speech and Language Processing"],"metadata":{"authorlinks":{}},"downloads":0},"search_terms":["minimum","classification","error","training","exponential","language","models","paciorek","rosenfeld"],"keywords":["statistical / machine learning methods in speech and language processing"],"authorIDs":[],"dataSources":["kQqCE6irCXYpDG9Gc","KDPjNhwvnT5c28g2y","aT2Bdqd3yGr5yRaP6","maXnNhf9hb89ouggd","kLLAwa6Pf92DFud7m","BKKwLBm9aCDcLbeCx","i8mmszhKPeNCyZZxv","bPc8CYbKuhZRFYTyw","GZEddczNBavvZyiJH","HgadyzdqnhGJk6PFG","JShfBMnfWmtQqtHFa","MohkSYNTEsLoXRdm6"]}