An online actor-critic algorithm with function approximation for constrained Markov decision processes
MaRDI QID: Q438776
DOI: 10.1007/s10957-012-9989-5
zbMath: 1262.90189
OpenAlex: W2073314543
Publication date: 31 July 2012
Published in: Journal of Optimization Theory and Applications
Full work available at URL: https://doi.org/10.1007/s10957-012-9989-5
Keywords: function approximation, actor-critic algorithm, constrained Markov decision process, long-run average cost criterion
MSC classifications: Markov chains (discrete-time Markov processes on discrete state spaces) (60J10); Markov and semi-Markov decision processes (90C40)
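As a rough illustration of the setting the keywords describe, the sketch below shows a generic Lagrangian actor-critic for a constrained average-cost MDP with linear value-function approximation. This is not the paper's algorithm; the toy MDP, variable names, constraint level, and step-size schedules are all assumptions made purely for illustration.

```python
# Minimal sketch (NOT the paper's algorithm): a generic Lagrangian actor-critic
# for a constrained average-cost MDP, with a linear critic and softmax actor.
# The toy model and all constants below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
nS, nA, nF = 5, 2, 4

phi = rng.standard_normal((nS, nF))   # fixed state features for the critic
v = np.zeros(nF)                      # critic weights (differential value fn)
theta = np.zeros((nS, nA))            # actor parameters (softmax policy)
lam = 0.0                             # Lagrange multiplier for the constraint
rho = 0.0                             # running avg of the Lagrangian cost
gbar = 0.0                            # running avg of the constraint cost
alpha = 0.5                           # constraint level: want avg g <= alpha (assumed)

P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # toy transition kernel
c = rng.random((nS, nA))                       # single-stage cost (to minimize)
g = rng.random((nS, nA))                       # single-stage constraint cost

def pi(s):
    w = np.exp(theta[s] - theta[s].max())
    return w / w.sum()

s = 0
for t in range(1, 50001):
    # Three timescales: critic fastest, actor slower, multiplier slowest,
    # mirroring the multi-timescale structure common in this literature.
    a_c, a_a, a_l = 1.0 / t**0.6, 1.0 / t**0.8, 1.0 / t
    a = rng.choice(nA, p=pi(s))
    s2 = rng.choice(nS, p=P[s, a])

    ell = c[s, a] + lam * (g[s, a] - alpha)   # relaxed Lagrangian cost
    rho += a_c * (ell - rho)                  # average Lagrangian-cost estimate
    gbar += a_c * (g[s, a] - gbar)            # average constraint-cost estimate

    delta = ell - rho + v @ phi[s2] - v @ phi[s]  # average-cost TD error
    v += a_c * delta * phi[s]                     # critic: TD(0) update

    grad_log = -pi(s)                  # d/dtheta[s] of log pi(a|s) ...
    grad_log[a] += 1.0                 # ... for a softmax policy
    theta[s] -= a_a * delta * grad_log # actor: descend, TD error as advantage proxy

    lam = max(0.0, lam + a_l * (gbar - alpha))  # multiplier: projected ascent
    s = s2

print(f"avg constraint cost ~= {gbar:.3f} (target <= {alpha})")
```

The projected ascent on the multiplier and the use of the TD error as an advantage estimate follow the standard pattern of the constrained actor-critic literature cited below, but the specific update forms here are a sketch, not a reproduction of the published method.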
Related Items
- Event-based optimization approach for solving stochastic decision problems with probabilistic constraint
- Multiscale Q-learning with linear function approximation
- Queueing Network Controls via Deep Reinforcement Learning
- Suboptimal control for nonlinear systems with disturbance via integral sliding mode control and policy iteration
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- Optimal deterministic controller synthesis from steady-state distributions
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Cites Work
- The Borkar-Meyn theorem for asynchronous stochastic approximations
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- Natural actor-critic algorithms
- Average cost temporal-difference learning
- An actor-critic algorithm for constrained Markov decision processes
- Optimal flow control of a class of queueing networks in equilibrium
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Asynchronous Stochastic Approximations
- On Actor-Critic Algorithms
- Simulation-based optimization of Markov reward processes
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Perturbation theory and finite Markov chains