A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
From MaRDI portal
Publication:6195318
DOI: 10.1137/22m150277x
arXiv: 2109.14756
MaRDI QID: Q6195318
Justin Romberg, Thinh T. Doan, Unnamed Author
Publication date: 13 March 2024
Published in: SIAM Journal on Optimization
Full work available at URL: https://arxiv.org/abs/2109.14756
MSC classes: Nonconvex programming, global optimization (90C26); Stochastic programming (90C15); Markov and semi-Markov decision processes (90C40)
Cites Work
- Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions
- Backpropagation and stochastic gradient descent method
- Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms
- Convergence rate of linear two-time-scale stochastic approximation
- Linear least-squares algorithms for temporal difference learning
- Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
- Convergence of a stochastic subgradient method with averaging for nonsmooth nonconvex constrained optimization
- An overview of bilevel optimization
- A Linearization Method for Nonsmooth Stochastic Programming Problems
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Learning Optimal Controllers for Linear Systems With Multiplicative Noise via Policy Gradient
- Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning
- Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization
- Finite-Time Convergence Rates of Decentralized Stochastic Approximation With Applications in Multi-Agent and Multi-Task Learning