Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Publication: 5870773
DOI: 10.1287/stsy.2022.0097
OpenAlex: W3173526354
Wikidata: Q114058119 (Scholia: Q114058119)
MaRDI QID: Q5870773
Siddharth Chandak, Parth Dodhia, Vivek S. Borkar
Publication date: 23 January 2023
Published in: Stochastic Systems
Full work available at URL: https://arxiv.org/abs/2106.14308
Keywords: reinforcement learning, concentration bounds, contractive stochastic approximation, asynchronous Q-learning, TD(0)
Cites Work
- Exponential inequalities for martingales and asymptotic properties of the free energy of directed polymers in a random environment
- Exact formula for sensitivity analysis of Markov chains
- Convergence results for single-step on-policy reinforcement-learning algorithms
- \({\mathcal Q}\)-learning
- Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling
- A concentration bound for contractive stochastic approximation
- Simplified description of slow Markov walks. II
- On the Convergence, Lock-In Probability, and Sample Complexity of Stochastic Approximation
- An Invariant Measure Approach to the Convergence of Stochastic Approximations with State Dependent Noise
- Markov Chains and Stochastic Stability
- On the Lock-in Probability of Stochastic Approximation
- An analysis of temporal-difference learning with function approximation
- Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
- Simulation-based optimization of Markov reward processes
- A Concentration Bound for Stochastic Approximation via Alekseev’s Formula
- Approximate Dynamic Programming
- Comparison of perturbation bounds for the stationary distribution of a Markov chain