OVERTRAINING, REGULARIZATION, AND SEARCHING FOR MINIMUM WITH APPLICATION TO NEURAL NETWORKS

OVERTRAINING, REGULARIZATION, AND SEARCHING FOR MINIMUM WITH APPLICATION TO NEURAL NETWORKS. Sjoberg, J & Ljung, L

In this paper we discuss the role of criterion minimization as a means for parameter estimation. Most traditional methods, such as maximum likelihood and prediction error identification, are based on this principle. However, somewhat surprisingly, it turns out that it is not always "optimal" to try to find the absolute minimum point of the criterion. The reason is that "stopped minimization" (where the iterations are terminated before the absolute minimum has been reached) has more or less the same properties as regularization (adding a parametric penalty term). Regularization is known to have beneficial effects on the variance of the parameter estimates, and it reduces the "variance contribution" of the misfit. This also explains the concept of "overtraining" in neural nets. How, then, does one know when to terminate the iterations? A useful criterion would be to stop iterating when the criterion function applied to a validation data set no longer decreases. However, we show in this paper that applying this technique extensively may yield an estimate that is an unregularized estimate for the total data set: estimation + validation data.
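
The link between stopped minimization and regularization claimed in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes a quadratic criterion with a diagonal Hessian, plain gradient descent started from zero, and a penalty that pulls toward zero. All names (`stopped_gd_factor`, `ridge_factor`, `gradient_descent`) and the numbers are hypothetical. The choice `delta = 1 / (step * n_iter)` is only a first-order matching of the two shrinkage factors for small eigenvalues, not a result quoted from the authors.

```python
# Sketch: shrinkage from stopped gradient descent vs. a quadratic penalty,
# on V(theta) = 0.5 * sum(eig_i * (theta_i - theta_hat_i)**2).
# Assumptions (not from the paper): diagonal Hessian, start at zero,
# penalty 0.5 * delta * theta_i**2 pulling toward zero.


def stopped_gd_factor(eig, step, n_iter):
    """Fraction of the unregularized minimizer reached after n_iter
    gradient steps from zero, along a direction with eigenvalue eig."""
    return 1.0 - (1.0 - step * eig) ** n_iter


def ridge_factor(eig, delta):
    """Fraction of the unregularized minimizer kept by the minimizer of
    the penalized criterion, along a direction with eigenvalue eig."""
    return eig / (eig + delta)


def gradient_descent(eigs, theta_hat, step, n_iter):
    """Run n_iter plain gradient steps on V, starting from zero."""
    theta = [0.0] * len(eigs)
    for _ in range(n_iter):
        # dV/dtheta_i = eig_i * (theta_i - theta_hat_i)
        theta = [t - step * e * (t - th)
                 for t, e, th in zip(theta, eigs, theta_hat)]
    return theta


# Hypothetical problem: one well-determined, one moderate, and one
# poorly determined parameter direction.
eigs = [10.0, 1.0, 0.01]
theta_hat = [1.0, 1.0, 1.0]
step, n_iter = 0.05, 40
delta = 1.0 / (step * n_iter)  # first-order matching, an assumption

theta_stopped = gradient_descent(eigs, theta_hat, step, n_iter)

for e, t in zip(eigs, theta_stopped):
    print(f"eig={e:6.2f}  stopped GD={t:.4f}  "
          f"penalized={ridge_factor(e, delta):.4f}")
```

In this toy setting both approaches leave the well-determined direction almost untouched and keep the poorly determined one close to its starting value, which is the sense in which stopping early acts like a penalty term. The agreement is qualitative, and it is closest along directions with small eigenvalues.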

@article{sjoberg_overtraining_nodate,
	title = {{OVERTRAINING}, {REGULARIZATION}, {AND} {SEARCHING} {FOR} {MINIMUM} {WITH} {APPLICATION} {TO} {NEURAL} {NETWORKS}},
	abstract = {In this paper we discuss the role of criterion minimization as a means for parameter estimation. Most traditional methods, such as maximum likelihood and prediction error identification, are based on this principle. However, somewhat surprisingly, it turns out that it is not always "optimal" to try to find the absolute minimum point of the criterion. The reason is that "stopped minimization" (where the iterations are terminated before the absolute minimum has been reached) has more or less the same properties as regularization (adding a parametric penalty term). Regularization is known to have beneficial effects on the variance of the parameter estimates, and it reduces the "variance contribution" of the misfit. This also explains the concept of "overtraining" in neural nets. How, then, does one know when to terminate the iterations? A useful criterion would be to stop iterating when the criterion function applied to a validation data set no longer decreases. However, we show in this paper that applying this technique extensively may yield an estimate that is an unregularized estimate for the total data set: estimation + validation data.},
	language = {en},
	author = {Sjoberg, J and Ljung, L},
	pages = {18},
}
