Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions. Abbasi-Yadkori, Y., Bartlett, P. L., & Szepesvári, C. CoRR, 2013.
@article{DBLP:journals/corr/abs-1303-3055,
  author    = {Yasin Abbasi{-}Yadkori and
               Peter L. Bartlett and
               Csaba Szepesv{\'{a}}ri},
  title     = {Online Learning in Markov Decision Processes with Adversarially Chosen
               Transition Probability Distributions},
  journal   = {CoRR},
  volume    = {abs/1303.3055},
  year      = {2013},
  url       = {http://arxiv.org/abs/1303.3055},
  archivePrefix = {arXiv},
  eprint    = {1303.3055},
  timestamp = {Mon, 13 Aug 2018 01:00:00 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1303-3055.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
