Policy mirror descent inherently explores action space
Publication:6663113
DOI: 10.1137/23m1560215
MaRDI QID: Q6663113
Publication date: 14 January 2025
Published in: SIAM Journal on Optimization
Mathematics Subject Classification: Analysis of algorithms and problem complexity (68Q25); Nonconvex programming, global optimization (90C26); Stochastic programming (90C15); Markov and semi-Markov decision processes (90C40)