
From perturbation analysis to Markov decision processes and reinforcement learning

From MaRDI portal
Publication:1870309

DOI: 10.1023/A:1022188803039
zbMath: 1031.93166
MaRDI QID: Q1870309

Xi-Ren Cao

Publication date: 11 May 2003

Published in: Discrete Event Dynamic Systems


zbMATH Keywords

Markov decision processes ⋮ perturbation analysis ⋮ reinforcement learning ⋮ on-line algorithms ⋮ Poisson equations ⋮ performance potentials ⋮ Q-learning ⋮ gradient-based policy iteration ⋮ TD(\(\lambda\))


Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05) ⋮ Perturbations in control/observation systems (93C73) ⋮ Stochastic learning and adaptive control (93E35) ⋮ Markov and semi-Markov decision processes (90C40)


Related Items (6)

Stochastic control via direct comparison ⋮ Performance optimization of queueing systems with perturbation realization ⋮ Policy iteration based feedback control ⋮ Continuous-time Markov decision processes with \(n\)th-bias optimality criteria ⋮ A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases ⋮ Error bounds of optimization algorithms for semi-Markov decision processes







This page was last edited on 1 February 2024, at 11:43.