Empirical Q-Value Iteration
Publication:5856670
DOI: 10.1287/stsy.2019.0062, zbMath: 1461.68184, arXiv: 1412.0180, OpenAlex: W3092476885, MaRDI QID: Q5856670
Dileep Kalathil, Rahul Jain, Vivek S. Borkar
Publication date: 29 March 2021
Published in: Stochastic Systems
Full work available at URL: https://arxiv.org/abs/1412.0180
Learning and adaptive systems in artificial intelligence (68T05); Dynamic programming (90C39); Stochastic approximation (62L20); Markov and semi-Markov decision processes (90C40)
Cites Work
- Unnamed Item (6 entries)
- Asynchronous stochastic approximation and Q-learning
- \({\mathcal Q}\)-learning
- Learning Algorithms for Markov Decision Processes with Average Cost
- Empirical Dynamic Programming
- Algorithms for Reinforcement Learning
- Merging of Opinions with Increasing Information
- Iterated Random Functions
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
- Exact sampling with coupled Markov chains and applications to statistical mechanics
- Actor-Critic–Type Learning Algorithms for Markov Decision Processes
- On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems
- Approximate Dynamic Programming