A reinforcement learning method with closed-loop stability guarantee

From MaRDI portal
Publication:6343677

arXiv: 2006.14034
MaRDI QID: Q6343677

Stefan Streif, Lukas Beckenbach, Thomas Göhrt, Pavel Osinenko

Publication date: 24 June 2020

Abstract: Reinforcement learning (RL) in the context of control systems offers wide possibilities for controller adaptation. Given an infinite-horizon cost function, the so-called critic of RL approximates it with a neural network and passes this information to the controller (called the "actor"). However, the issue of closed-loop stability under an RL method is still not fully addressed. Since the critic delivers merely an approximation of the value function of the corresponding infinite-horizon problem, no guarantee can be given in general as to whether the actor's actions stabilize the system. Different approaches to this issue exist. The current work offers a particular one which, starting from a (not necessarily smooth) control Lyapunov function (CLF), derives an online RL scheme such that a practical semi-global stability property of the closed loop can be established. The approach logically continues the authors' work on parameterized controllers and Lyapunov-like constraints for RL, while the CLF now appears merely in one of the constraints of the control scheme. The analysis of the closed-loop behavior is carried out in a sample-and-hold (SH) manner, thus offering insight into the digital realization. A case study with a nonholonomic integrator shows the ability of the derived method to optimize the given cost function compared to a nominal stabilizing controller.
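
To make the abstract's structure concrete, the following is a minimal Python sketch of a CLF-constrained actor step executed in a sample-and-hold loop. It is only an illustration of the general idea, not the paper's method or the rcognita implementation: the single-integrator plant, the quadratic (smooth) CLF, the fixed linear critic weights, and the decay margin nu are assumed placeholders, and the critic is frozen rather than updated online as in the paper.

# Sketch (assumptions): sample-and-hold actor step with a CLF decay constraint.
# The plant, CLF, cost, and critic below are illustrative placeholders only.
import numpy as np
from scipy.optimize import minimize

dt = 0.1   # sampling period of the sample-and-hold scheme (assumed)
nu = 0.05  # required CLF decay margin / target-ball size (assumed)

def plant_step(x, u):
    # Single-integrator dynamics, control held constant over one sampling period.
    return x + dt * u

def stage_cost(x, u):
    # Quadratic running cost (placeholder).
    return float(x @ x + 0.1 * u @ u)

def clf(x):
    # CLF candidate; the paper allows nonsmooth CLFs, this quadratic one is only illustrative.
    return float(x @ x)

def critic(x, w):
    # Linear-in-features critic approximating the infinite-horizon cost.
    phi = np.array([x[0] ** 2, x[1] ** 2, x[0] * x[1]])
    return float(w @ phi)

def actor(x, w, u_dim=2):
    # Pick the held control by minimizing stage cost plus the critic's value
    # of the predicted next state, subject to a CLF decrease constraint.
    def objective(u):
        return stage_cost(x, u) + critic(plant_step(x, u), w)

    def clf_decay(u):
        # Enforce decay only outside a small target ball, mirroring the
        # "practical" (rather than asymptotic) stability claim.
        if clf(x) <= nu:
            return 1.0  # constraint trivially satisfied near the target set
        return clf(x) - clf(plant_step(x, u)) - nu * dt

    res = minimize(objective, np.zeros(u_dim),
                   constraints=[{"type": "ineq", "fun": clf_decay}])
    return res.x

# One closed-loop run from an arbitrary initial state, critic weights frozen.
w = np.array([1.0, 1.0, 0.0])
x = np.array([1.0, -0.5])
for _ in range(50):
    u = actor(x, w)
    x = plant_step(x, u)
print("final state:", x)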

Has companion code repository: https://github.com/pavel-osinenko/rcognita

This page was built for publication: A reinforcement learning method with closed-loop stability guarantee
