
Approximate gradient methods in policy-space optimization of Markov reward processes

From MaRDI portal
Publication:1870312

DOI: 10.1023/A:1022145020786
zbMATH: 1042.93061
OpenAlex: W1554366315
MaRDI QID: Q1870312

Peter Marbach, John N. Tsitsiklis

Publication date: 11 May 2003

Published in: Discrete Event Dynamic Systems

Full work available at URL: https://doi.org/10.1023/a:1022145020786


zbMATH Keywords

simulation-based optimization ⋮ Markov reward processes ⋮ policy-space optimization


Mathematics Subject Classification ID

Discrete-time control/observation systems (93C55) ⋮ Optimal stochastic control (93E20)


Related Items (6)

Environment-driven distributed evolutionary adaptation in a population of autonomous robotic agents ⋮ Modeling and optimization of a product-service system with additional service capacity and impatient customers ⋮ Simulation-based optimization of Markov decision processes: an empirical process theory approach ⋮ Analysis and improvement of policy gradient estimation ⋮ Deep Reinforcement Learning: A State-of-the-Art Walkthrough ⋮ Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities




This page was last edited on 1 February 2024, at 12:43.