Robust exploration in linear quadratic reinforcement learning

arXiv: 1906.01584
MaRDI QID: Q6319965

Author name not available

Publication date: 4 June 2019

Abstract: This paper concerns the problem of learning control policies for an unknown linear dynamical system to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task robustly: i.e., we minimize the worst-case cost, accounting for system uncertainty given the observed data. The method balances exploitation and exploration, exciting the system so as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.
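
To make the problem setting in the abstract concrete, here is a minimal sketch of the simplest baseline for it: excite a scalar linear system with white noise, fit the model by least squares, and compute a certainty-equivalent LQR gain from the estimate. This is not the paper's method, which instead minimizes the worst-case cost over the model uncertainty and chooses the excitation deliberately. The function names (`simulate`, `least_squares`, `lqr_gain`), the scalar system, and all numerical values are assumptions made for illustration only.

```python
import random


def simulate(a, b, steps, noise_std, seed=0):
    """Roll out x[t+1] = a*x[t] + b*u[t] + w[t] under random excitation."""
    rng = random.Random(seed)
    x, data = 0.0, []
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)  # naive exploration: plain white-noise input
        x_next = a * x + b * u + rng.gauss(0.0, noise_std)
        data.append((x, u, x_next))
        x = x_next
    return data


def least_squares(data):
    """Estimate (a, b) by solving the 2x2 normal equations directly."""
    sxx = sum(x * x for x, _, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    a_hat = (sxy * suu - suy * sxu) / det
    b_hat = (suy * sxx - sxy * sxu) / det
    return a_hat, b_hat


def lqr_gain(a, b, q=1.0, r=1.0, iters=500):
    """Iterate the scalar discrete-time Riccati equation.

    Returns the gain k of the policy u = -k*x for the cost sum(q*x^2 + r*u^2).
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    # Hypothetical "true" system, unknown to the learner.
    a_true, b_true = 0.9, 0.5
    data = simulate(a_true, b_true, steps=500, noise_std=0.1)
    a_hat, b_hat = least_squares(data)
    k = lqr_gain(a_hat, b_hat)
    print("estimate:", a_hat, b_hat)
    print("gain:", k, "true closed-loop pole:", a_true - b_true * k)
```

The certainty-equivalent gain treats the estimate as exact, so it carries no guarantee when the data are scarce or poorly exciting. That gap is what a worst-case formulation such as the one described in the abstract is meant to address.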

Has companion code repository: https://github.com/umenberger/robust-exploration
