From perturbation analysis to Markov decision processes and reinforcement learning
Publication:1870309
DOI: 10.1023/A:1022188803039
zbMath: 1031.93166
MaRDI QID: Q1870309
Publication date: 11 May 2003
Published in: Discrete Event Dynamic Systems
Markov decision processes; perturbation analysis; reinforcement learning; on-line algorithms; Poisson equations; performance potentials; Q-learning; gradient-based policy iteration; TD(\(\lambda\))
Learning and adaptive systems in artificial intelligence (68T05); Perturbations in control/observation systems (93C73); Stochastic learning and adaptive control (93E35); Markov and semi-Markov decision processes (90C40)
Related Items (6)
Stochastic control via direct comparison ⋮ Performance optimization of queueing systems with perturbation realization ⋮ Policy iteration based feedback control ⋮ Continuous-time Markov decision processes with \(n\)th-bias optimality criteria ⋮ A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases ⋮ Error bounds of optimization algorithms for semi-Markov decision processes