Optimal rates for zero-order convex optimization: the power of two function evaluations. Duchi, J. C., Jordan, M. I., Wainwright, M. J., & Wibisono, A. August 2014. arXiv:1312.2139.

Abstract: We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for $d$-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic gradient methods. We establish such results for both smooth and non-smooth cases, sharpening previous analyses that suggested a worse dimension dependence, and extend our results to the case of multiple ($m \ge 2$) evaluations. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, establishing the sharpness of our achievable results up to constant (sometimes logarithmic) factors.
@misc{duchi2014optimal,
 abstract = {We consider derivative-free algorithms for stochastic and non-stochastic
convex optimization problems that use only function values rather than
gradients. Focusing on non-asymptotic bounds on convergence rates, we show that
if pairs of function values are available, algorithms for $d$-dimensional
optimization that use gradient estimates based on random perturbations suffer a
factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic
gradient methods. We establish such results for both smooth and non-smooth
cases, sharpening previous analyses that suggested a worse dimension
dependence, and extend our results to the case of multiple ($m \ge 2$)
evaluations. We complement our algorithmic development with
information-theoretic lower bounds on the minimax convergence rate of such
problems, establishing the sharpness of our achievable results up to constant
(sometimes logarithmic) factors.},
 archiveprefix = {arXiv},
 author = {Duchi, John C. and Jordan, Michael I. and Wainwright, Martin J. and Wibisono, Andre},
 eprint = {1312.2139},
 keywords = {optimization derivative_free},
 month = aug,
 title = {Optimal rates for zero-order convex optimization: the power of two function evaluations},
 url = {http://arxiv.org/abs/1312.2139},
 year = 2014
}
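The pair-of-evaluations idea the abstract describes is simple to sketch: perturb the current iterate in a random direction, difference the two function values to form a gradient estimate, and feed that estimate to ordinary stochastic gradient descent. Below is a minimal illustration, assuming a standard symmetric two-point estimator with directions drawn uniformly from the unit sphere and a 1/sqrt(t) step-size schedule; the perturbation distributions, scalings, and step sizes analyzed in the paper differ in their details, and the names two_point_gradient and zero_order_sgd are illustrative, not from the paper.

import numpy as np

def two_point_gradient(f, x, delta, rng):
    """Gradient estimate built from a pair of function values.

    Draws a direction u uniformly on the unit sphere and returns
        g = d * (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u,
    an unbiased estimate of the gradient of a smoothed surrogate of f
    (and of grad f itself when f is quadratic).
    """
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # normalized Gaussian = uniform direction on the sphere
    scale = d * (f(x + delta * u) - f(x - delta * u)) / (2.0 * delta)
    return scale * u

def zero_order_sgd(f, x0, steps=2000, step_size=0.1, delta=1e-4, seed=0):
    """Stochastic gradient descent driven only by function evaluations."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for t in range(1, steps + 1):
        g = two_point_gradient(f, x, delta, rng)
        x -= (step_size / np.sqrt(t)) * g  # standard 1/sqrt(t) decay
    return x

if __name__ == "__main__":
    # Toy smooth convex problem: f(x) = ||x - 1||^2 in d = 20 dimensions.
    d = 20
    f = lambda x: float(np.sum((x - np.ones(d)) ** 2))
    x_hat = zero_order_sgd(f, x0=np.zeros(d))
    print("final suboptimality:", f(x_hat))

Heuristically, the $\sqrt{d}$ factor from the abstract shows up in this sketch through the d * (...) scaling of the estimator, whose second moment grows linearly in the dimension, so the iteration pays a dimension-dependent price relative to SGD with exact gradients.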
{"_id":"f3biuWRADCwmuRWiM","bibbaseid":"duchi-jordan-wainwright-wibisono-optimalratesforzeroorderconvexoptimizationthepoweroftwofunctionevaluations-2014","authorIDs":[],"author_short":["Duchi, J. C.","Jordan, M. I.","Wainwright, M. J.","Wibisono, A."],"bibdata":{"bibtype":"misc","type":"misc","abstract":"We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for \\$d\\$-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most \\$\\sqrt{d}\\$ in convergence rate over traditional stochastic gradient methods. We establish such results for both smooth and non-smooth cases, sharpening previous analyses that suggested a worse dimension dependence, and extend our results to the case of multiple (\\$m \\ge 2\\$) evaluations. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, establishing the sharpness of our achievable results up to constant (sometimes logarithmic) factors.","added-at":"2018-12-07T09:10:16.000+0100","archiveprefix":"arXiv","author":[{"propositions":[],"lastnames":["Duchi"],"firstnames":["John","C."],"suffixes":[]},{"propositions":[],"lastnames":["Jordan"],"firstnames":["Michael","I."],"suffixes":[]},{"propositions":[],"lastnames":["Wainwright"],"firstnames":["Martin","J."],"suffixes":[]},{"propositions":[],"lastnames":["Wibisono"],"firstnames":["Andre"],"suffixes":[]}],"biburl":"https://www.bibsonomy.org/bibtex/28b0e71de07069134b4bf4c3f3545554d/jpvaldes","citeulike-article-id":"12834238","citeulike-linkout-0":"http://arxiv.org/abs/1312.2139","citeulike-linkout-1":"http://arxiv.org/pdf/1312.2139","day":"20","eprint":"1312.2139","interhash":"670afef60e4881e0959d86fd8719ae01","intrahash":"8b0e71de07069134b4bf4c3f3545554d","keywords":"optimization derivative_free","month":"August","posted-at":"2017-04-04 15:49:53","priority":"2","timestamp":"2018-12-07T09:17:38.000+0100","title":"Optimal rates for zero-order convex optimization: the power of two function evaluations","url":"http://arxiv.org/abs/1312.2139","year":"2014","bibtex":"@misc{duchi2014optimal,\n abstract = {We consider derivative-free algorithms for stochastic and non-stochastic\r\nconvex optimization problems that use only function values rather than\r\ngradients. Focusing on non-asymptotic bounds on convergence rates, we show that\r\nif pairs of function values are available, algorithms for \\$d\\$-dimensional\r\noptimization that use gradient estimates based on random perturbations suffer a\r\nfactor of at most \\$\\sqrt{d}\\$ in convergence rate over traditional stochastic\r\ngradient methods. We establish such results for both smooth and non-smooth\r\ncases, sharpening previous analyses that suggested a worse dimension\r\ndependence, and extend our results to the case of multiple (\\$m \\ge 2\\$)\r\nevaluations. We complement our algorithmic development with\r\ninformation-theoretic lower bounds on the minimax convergence rate of such\r\nproblems, establishing the sharpness of our achievable results up to constant\r\n(sometimes logarithmic) factors.},\n added-at = {2018-12-07T09:10:16.000+0100},\n archiveprefix = {arXiv},\n author = {Duchi, John C. and Jordan, Michael I. and Wainwright, Martin J. 
and Wibisono, Andre},\n biburl = {https://www.bibsonomy.org/bibtex/28b0e71de07069134b4bf4c3f3545554d/jpvaldes},\n citeulike-article-id = {12834238},\n citeulike-linkout-0 = {http://arxiv.org/abs/1312.2139},\n citeulike-linkout-1 = {http://arxiv.org/pdf/1312.2139},\n day = 20,\n eprint = {1312.2139},\n interhash = {670afef60e4881e0959d86fd8719ae01},\n intrahash = {8b0e71de07069134b4bf4c3f3545554d},\n keywords = {optimization derivative_free},\n month = aug,\n posted-at = {2017-04-04 15:49:53},\n priority = {2},\n timestamp = {2018-12-07T09:17:38.000+0100},\n title = {Optimal rates for zero-order convex optimization: the power of two function evaluations},\n url = {http://arxiv.org/abs/1312.2139},\n year = 2014\n}\n\n","author_short":["Duchi, J. C.","Jordan, M. I.","Wainwright, M. J.","Wibisono, A."],"key":"duchi2014optimal","id":"duchi2014optimal","bibbaseid":"duchi-jordan-wainwright-wibisono-optimalratesforzeroorderconvexoptimizationthepoweroftwofunctionevaluations-2014","role":"author","urls":{"Paper":"http://arxiv.org/abs/1312.2139"},"keyword":["optimization derivative_free"],"downloads":0},"bibtype":"misc","biburl":"http://www.bibsonomy.org/bib/author/wainwright?items=1000","creationDate":"2019-07-11T17:45:21.556Z","downloads":0,"keywords":["optimization derivative_free"],"search_terms":["optimal","rates","zero","order","convex","optimization","power","two","function","evaluations","duchi","jordan","wainwright","wibisono"],"title":"Optimal rates for zero-order convex optimization: the power of two function evaluations","year":2014,"dataSources":["9g7YFYRTAHoh4aFeS"]}