Fast convergence to state-action frequency polytopes for MDPs
From MaRDI portal
Publication:1015315
DOI: 10.1016/j.orl.2008.12.003
zbMath: 1159.90512
OpenAlex: W2012559397
MaRDI QID: Q1015315
Publication date: 7 May 2009
Published in: Operations Research Letters
Full work available at URL: https://doi.org/10.1016/j.orl.2008.12.003
Related Items (2)
Percentile queries in multi-dimensional Markov decision processes ⋮ Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games
Cites Work
- Unnamed Item
- Unnamed Item
- Hoeffding's inequality for uniformly ergodic Markov chains
- Finite state Markovian decision processes
- Rate of Convergence of Empirical Measures and Costs in Controlled Markov Chains and Transient Optimality
- On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies