PAC Bounds for Multi-armed Bandit and Markov Decision Processes. Even-Dar, E., Mannor, S., & Mansour, Y. In Kivinen, J. & Sloan, R. H., editors, Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), volume 2375 of Lecture Notes in Computer Science, pages 255-270, Berlin / Heidelberg, Germany, 2002. Springer.