Regularized Fitted Q-Iteration: Application to Planning

Regularized Fitted Q-Iteration: Application to Planning. Farahmand, A., Ghavamzadeh, M., Szepesvári, C., & Mannor, S. In EWRL, pages 55–68, 2008.

We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model-complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
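As a rough illustration of the idea in the abstract, the sketch below runs fitted Q-iteration with penalized least-squares regression in a reproducing kernel Hilbert space. It is not the paper's algorithm as published: the 1-D toy environment (`generative_model`), the Gaussian kernel and its bandwidth, the fixed penalty `lam` (the abstract argues for data-dependent penalties, which are not attempted here), and all parameter values are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth=0.2):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def generative_model(x, a, rng):
    # Hypothetical toy problem on [0, 1]: action 1 moves right, action 0 moves
    # left, and the reward equals the (noisy, clipped) next state.
    step = 0.1 if a == 1 else -0.1
    x_next = np.clip(x + step + 0.01 * rng.standard_normal(x.shape), 0.0, 1.0)
    return x_next, x_next[:, 0]

def regularized_fqi(n_samples=100, n_iters=30, gamma=0.9, lam=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    actions = (0, 1)
    # Sample states once and query the generative model for every action.
    X = rng.uniform(0.0, 1.0, size=(n_samples, 1))
    data = {a: generative_model(X, a, rng) for a in actions}
    # Kernel matrices: between sampled states, and from next states to samples.
    K = rbf_kernel(X, X)
    K_next = {a: rbf_kernel(data[a][0], X) for a in actions}
    # Penalized least squares in the RKHS has the closed-form solution
    # alpha = (K + n * lam * I)^{-1} y, one coefficient vector per action.
    A = K + n_samples * lam * np.eye(n_samples)
    alpha = {a: np.zeros(n_samples) for a in actions}
    for _ in range(n_iters):
        new_alpha = {}
        for a in actions:
            rewards = data[a][1]
            # Current Q estimate at the next states, for every action.
            q_next = np.stack([K_next[a] @ alpha[b] for b in actions], axis=1)
            # Regression targets: reward plus discounted greedy value.
            targets = rewards + gamma * q_next.max(axis=1)
            new_alpha[a] = np.linalg.solve(A, targets)
        alpha = new_alpha

    def q_values(states):
        # Evaluate the final Q estimate; returns shape (len(states), 2).
        Ks = rbf_kernel(states, X)
        return np.stack([Ks @ alpha[a] for a in actions], axis=1)

    return q_values
```

Calling `q = regularized_fqi()` and then `q(np.array([[0.5]]))` returns one estimated action value per action at state 0.5; the greedy policy picks the larger one. Increasing `lam` smooths the fitted Q-functions more strongly, which is the complexity control the abstract refers to.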

@inproceedings{farahmand2008,
	abstract = {We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model-complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.},
	author = {Farahmand, A.m. and Ghavamzadeh, M. and Szepesv{\'a}ri, Cs. and Mannor, S.},
	bibsource = {DBLP, http://dblp.uni-trier.de},
	booktitle = {EWRL},
	doi = {10.1007/978-3-540-89722-4_5},
	entrysubtype = {unrefereed},
	keywords = {reinforcement learning, planning, regularization, nonparametrics, theory, function approximation, value iteration},
	pages = {55--68},
	title = {Regularized Fitted Q-Iteration: Application to Planning},
	url_paper = {RegFQI-Plan-EWRL08.pdf},
	year = {2008},
	Bdsk-Url-1 = {http://dx.doi.org/10.1007/978-3-540-89722-4_5}}
