Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes

From MaRDI portal
Publication:6359420

DOI: 10.1007/s10107-022-01816-5
zbMath: 1512.90150
arXiv: 2102.00135
Wikidata: Q114852452
Scholia: Q114852452
MaRDI QID: Q6359420

Guanghui Lan

Publication date: 29 January 2021










