Towards Minimax Policies for Online Linear Optimization with Bandit Feedback. Bubeck, S., Cesa-Bianchi, N., & Kakade, S. In Mannor, S., Srebro, N., & Williamson, R., editors, Proceedings of the 25th Annual Conference on Learning Theory (COLT), volume PMLR 23, pages 41.1–41.14, 2012.
