
Homotopic policy mirror descent: policy convergence, algorithmic regularization, and improved sample complexity

Publication:6608040

DOI: 10.1007/s10107-023-02017-4
MaRDI QID: Q6608040

Yan Li, Guanghui Lan, Tuo Zhao

Publication date: 19 September 2024

Published in: Mathematical Programming. Series A. Series B



zbMATH Keywords

sample complexity, policy convergence, local acceleration, policy gradient method


Mathematics Subject Classification ID

Analysis of algorithms and problem complexity (68Q25)
Nonconvex programming, global optimization (90C26)
Stochastic programming (90C15)
Markov and semi-Markov decision processes (90C40)








