Near-optimal Regret Bounds for Reinforcement Learning. Jaksch, T., Ortner, R., & Auer, P. Journal of Machine Learning Research, 11(Apr):1563–1600, 2010.
Abstract: For undiscounted reinforcement learning in Markov decision processes (MDPs) we consider the total regret of a learning algorithm with respect to an optimal policy. In order to describe the transition structure of an MDP we propose a new parameter: An MDP has diameter D if for any pair of states s, s' there is a policy which moves from s to s' in at most D steps (on average). We present a reinforcement learning algorithm with total regret Õ(DS√(AT)) after T steps for any unknown MDP with S states, A actions per state, and diameter D. A corresponding lower bound of Ω(√(DSAT)) on the total regret of any learning algorithm is also given.
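The diameter is concrete enough to compute when the transition kernel is known: D is the maximum over ordered state pairs (s, s') of the minimum expected number of steps to move from s to s', so each target state defines a stochastic shortest-path problem solvable by value iteration on expected hitting times. The sketch below illustrates this; the function name diameter, the value-iteration scheme, and the tolerance are assumptions for illustration, not code from the paper, and the loop only converges for communicating MDPs (where D is finite).

import numpy as np

def diameter(P, tol=1e-8, max_iter=100_000):
    # P: transition tensor of shape (S, A, S) with P[s, a, t] = Pr(t | s, a).
    S, A, _ = P.shape
    D = 0.0
    for target in range(S):
        # h[s] = minimal expected number of steps from s to `target`,
        # found by value iteration on the stochastic shortest-path problem.
        h = np.zeros(S)
        for _ in range(max_iter):
            q = 1.0 + P @ h          # q[s, a]: one step, then continue optimally
            h_new = q.min(axis=1)    # best action per state
            h_new[target] = 0.0      # already at the target: no further steps
            if np.max(np.abs(h_new - h)) < tol:
                h = h_new
                break
            h = h_new
        D = max(D, h.max())          # worst start state for this target
    return D

# Two-state, two-action MDP: action 0 stays put, action 1 switches states
# with probability 0.9, so the diameter is the expected switching time
# 1 / 0.9 ≈ 1.11.
P = np.array([[[1.0, 0.0], [0.1, 0.9]],
              [[0.0, 1.0], [0.9, 0.1]]])
print(diameter(P))  # ~1.1111

Note that the paper's algorithm never computes D itself; the parameter appears only in the regret analysis, so a planning-style computation like this is purely illustrative.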
@Article{Jaksch2010,
  author  = {Jaksch, Thomas and Ortner, Ronald and Auer, Peter},
  title   = {Near-optimal Regret Bounds for Reinforcement Learning},
  journal = {Journal of Machine Learning Research},
  volume  = {11},
  number  = {Apr},
  pages   = {1563--1600},
  year    = {2010},
  abstract = {For undiscounted reinforcement learning in Markov decision processes (MDPs) we consider the total regret of a learning algorithm with respect to an optimal policy. In order to describe the transition structure of an MDP we propose a new parameter: An MDP has diameter $D$ if for any pair of states $s, s'$ there is a policy which moves from $s$ to $s'$ in at most $D$ steps (on average). We present a reinforcement learning algorithm with total regret $\tilde{O}(DS\sqrt{AT})$ after $T$ steps for any unknown MDP with $S$ states, $A$ actions per state, and diameter $D$. A corresponding lower bound of $\Omega(\sqrt{DSAT})$ on the total regret of any learning algorithm is also given.}
}
