Bounds for the quality and the number of steps in Bellman's value iteration algorithm
From MaRDI portal
Publication:1317533
DOI: 10.1007/BF01719454 zbMath: 0808.90127 OpenAlex: W2088930631 MaRDI QID: Q1317533
Publication date: 27 March 1995
Published in: OR Spektrum
Full work available at URL: https://doi.org/10.1007/bf01719454
Keywords: infinite horizon; \(\varepsilon\)-optimal policy; finite state space; discounted Markovian decision process; sub-optimal decisions
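The keywords refer to value iteration on a discounted Markovian decision process, where the contraction property yields bounds on solution quality and on the number of steps needed. As a hypothetical illustration (not the paper's own bounds), the sketch below runs standard value iteration on a tiny two-state, two-action MDP with the classical textbook stopping rule \(\|V_{k+1}-V_k\|_\infty \le \varepsilon(1-\gamma)/(2\gamma)\), which guarantees that the greedy policy is \(\varepsilon\)-optimal; the MDP data is invented for the example.

```python
# Standard value iteration sketch (illustrative MDP, not from the paper).
# P[a][s][t]: probability of moving s -> t under action a; R[a][s]: reward.
P = [[[0.9, 0.1], [0.2, 0.8]],   # action 0
     [[0.5, 0.5], [0.7, 0.3]]]   # action 1
R = [[1.0, 0.0],                 # action 0
     [0.5, 2.0]]                 # action 1
gamma, eps = 0.9, 1e-6           # discount factor, target sub-optimality

V = [0.0, 0.0]
steps = 0
while True:
    steps += 1
    # Bellman operator: Q(a, s) = R(a, s) + gamma * E[V(next state)]
    Q = [[R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(2))
          for s in range(2)] for a in range(2)]
    V_new = [max(Q[a][s] for a in range(2)) for s in range(2)]
    # Classical stopping rule: residual <= eps*(1-gamma)/(2*gamma)
    # implies the greedy policy w.r.t. V_new is eps-optimal.
    if max(abs(V_new[s] - V[s]) for s in range(2)) <= eps * (1 - gamma) / (2 * gamma):
        V = V_new
        break
    V = V_new

# Greedy (eps-optimal) policy extracted from the final value estimate.
policy = [max(range(2),
              key=lambda a: R[a][s] + gamma * sum(P[a][s][t] * V[t]
                                                  for t in range((2))))
          for s in range(2)]
print(steps, [round(v, 3) for v in V], policy)
```

Because the Bellman operator is a \(\gamma\)-contraction, the residual shrinks geometrically, so the number of iterations needed grows only logarithmically in \(1/\varepsilon\); bounds of this flavor are what the publication's title refers to.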
Cites Work
- A polynomial time bound for Howard's policy improvement algorithm
- Abschätzungen für Spektralwerte [Estimates for spectral values]
- Bounds and good policies in stationary finite-stage Markovian decision problems
- On the Fixed Points of the Optimal Reward Operator in Stochastic Dynamic Programming with Discount Factor Greater than One