A stability criterion for two timescale stochastic approximation schemes
From MaRDI portal
Publication:2409333
DOI: 10.1016/j.automatica.2016.12.014
zbMath: 1371.93208
OpenAlex: W2184204218
MaRDI QID: Q2409333
Shalabh Bhatnagar, Chandrashekar Lakshminarayanan
Publication date: 11 October 2017
Published in: Automatica
Full work available at URL: https://doi.org/10.1016/j.automatica.2016.12.014
Keywords: simulation, reinforcement learning, two-timescale stochastic approximation, limiting ODE, stability of iterates
Time-scale analysis and singular perturbations in control/observation systems (93C70); Identification in stochastic control theory (93E12); Stochastic learning and adaptive control (93E35)
Related Items
- Sequential online subsampling for thinning experimental designs
- Convergence of stochastic approximation via martingale and converse Lyapunov methods
- Whittle index based Q-learning for restless bandits with average reward
Cites Work
- Natural actor-critic algorithms
- Stochastic approximation with two time scales
- Average cost temporal-difference learning
- Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
- Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- A two Timescale Stochastic Approximation Scheme for Simulation-Based Parametric Optimization
- Perturbation theory and finite Markov chains