Error bounds for constant step-size \(Q\)-learning
Publication:1932736
DOI: 10.1016/j.sysconle.2012.08.014
zbMath: 1255.93129
OpenAlex: W1999254175
MaRDI QID: Q1932736
Publication date: 21 January 2013
Published in: Systems & Control Letters
Full work available at URL: https://doi.org/10.1016/j.sysconle.2012.08.014
- Learning and adaptive systems in artificial intelligence (68T05)
- Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
- Stochastic systems in control theory (general) (93E03)
Related Items
- Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
- A Discrete-Time Switching System Analysis of Q-Learning
- Recent advances in reinforcement learning in finance
- Settling the sample complexity of model-based offline reinforcement learning
- Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
- Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
- Convergence of Recursive Stochastic Algorithms Using Wasserstein Divergence
Cites Work
- Q-learning and policy iteration algorithms for stochastic shortest path problems
- Asynchronous stochastic approximation and Q-learning
- \({\mathcal Q}\)-learning
- Boundedness of iterates in \(Q\)-learning
- Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning