Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
DOI: 10.1137/19M1258104
zbMATH: 1485.93646
arXiv: 1904.10778
OpenAlex: W3092741379
MaRDI QID: Q5037552
Jianzong Pi, Gaurav Tendolkar, Hao Chen, Abhishek Gupta
Publication date: 1 March 2022
Published in: SIAM Journal on Mathematics of Data Science
Full work available at URL: https://arxiv.org/abs/1904.10778
Keywords: stochastic gradient descent; Feller Markov chains; iterative random maps; constant stepsize Q-learning; empirical dynamic programming
MSC classifications:
- Dynamic programming (90C39)
- Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
- Stochastic learning and adaptive control (93E35)
- Markov and semi-Markov decision processes (90C40)
Cites Work
- Minimizing finite sums with the stochastic average gradient
- Simulation-based optimization of Markov decision processes: an empirical process theory approach
- Ergodicity and central limit theorems for a class of Markov processes
- Asynchronous stochastic approximation and Q-learning
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Weak convergence and empirical processes. With applications to statistics
- Error bounds for constant step-size Q-learning
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Empirical Dynamic Programming
- Concentration of Measure Inequalities in Information Theory, Communications, and Coding
- Approximate Iterative Algorithms
- Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
- Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression
- Real Analysis on Intervals
- The Strong Law of Large Numbers for a Class of Markov Chains
- Large-Scale Machine Learning with Stochastic Gradient Descent
- Markov Chains and Stochastic Stability
- Robust Stochastic Approximation Approach to Stochastic Programming
- Distributed asynchronous deterministic and stochastic gradient optimization algorithms
- Recursive Stochastic Algorithms for Global Optimization in $\mathbb{R}^d$
- Acceleration of Stochastic Approximation by Averaging
- Weak convergence of a sequence of Markov chains
- Approximations of Dynamic Programs, I
- Approximations of Dynamic Programs, II
- Iterated Random Functions
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Convergence of Simulation-Based Policy Iteration
- Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
- Real Analysis and Applications
- Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
- Metropolis-Type Annealing Algorithms for Global Optimization in $\mathbb{R}^d$
- Asymptotic Behavior of a Markovian Stochastic Algorithm with Constant Step
- Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation
- Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
- A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs
- Riemannian Stochastic Variance Reduced Gradient Algorithm with Retraction and Vector Transport
- Probability Inequalities for Sums of Bounded Random Variables
- Stochastic Gradient Descent on Riemannian Manifolds
- Empirical Q-Value Iteration
- Convergence of stochastic processes