An Algorithm to Identify and Compute Average Optimal Policies in Multichain Markov Decision Processes
DOI: 10.1287/MOOR.28.3.553.16388; zbMath: 1082.90129; OpenAlex: W2159692787; Wikidata: Q114967779; Scholia: Q114967779; MaRDI QID: Q5704141
Publication date: 11 November 2005
Published in: Mathematics of Operations Research
Full work available at URL: https://doi.org/10.1287/moor.28.3.553.16388
Markov decision process; long-run average cost; optimal policies; compact action sets; computation algorithm; approximate optimal policies; communication classes, finite state space; multichain MDP
Markov chains (discrete-time Markov processes on discrete state spaces) (60J10); Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20); Markov and semi-Markov decision processes (90C40)