Adaptive policy-iteration and policy-value-iteration for discounted Markov decision processes
Publication:3984139
DOI: 10.1007/BF01415991 · zbMath: 0748.90076 · MaRDI QID: Q3984139
No author found.
Publication date: 27 June 1992
Published in: [https://portal.mardi4nfdi.de/entity/Q3031760 ZOR Zeitschrift für Operations Research Methods and Models of Operations Research]
nonstationary value iteration; discounted Markov decision process; policy-iteration; asymptotically discount optimal policies; policy-value-iteration
Cites Work
- A unified approach to adaptive control of average reward Markov decision processes
- Nonstationary Markov decision problems with converging parameters
- Adaptive Markov control processes
- Adaptive Policies in Markov Decision Processes with Uncertain Transition Matrices
- Estimation and control in discounted stochastic dynamic programming
- Learning algorithms for Markov decision processes
- A set of successive approximation methods for discounted Markovian decision problems
- Modified Policy Iteration Algorithms for Discounted Markov Decision Problems
- Approximations of Dynamic Programs, I
- Estimation and control in Markov chains