In IEEE Conference on Decision and Control (CDC), pages 6840-6847, 2018. Pdf Link doi abstract bibtex
It is critical to obtain stability certificate before deploying reinforcement learning in real-world mission-critical systems. This study justifies the intuition that smoothness (i.e., small changes in inputs lead to small changes in outputs) is an important property for stability-certified reinforcement learning from a control-theoretic perspective. The smoothness margin can be obtained by solving a feasibility problem based on semi-definite programming for both linear and nonlinear dynamical systems, and it does not need to access the exact parameters of the learned controllers. Numerical evaluation on nonlinear and decentralized frequency control for large-scale power grids demonstrates that the smoothness margin can certify stability during both exploration and deployment for (deep) neural-network policies, which substantially surpass nominal controllers in performance. The study opens up new opportunities for robust Lipschitz continuous policy learning.