A projected primal-dual gradient optimal control method for deep reinforcement learning
DOI: 10.1186/s13362-020-00075-3
zbMath: 1472.49042
OpenAlex: W3029445142
MaRDI QID: Q1980960
Michael Burger, Simon Gottschalk, Matthias Gerdts
Publication date: 9 September 2021
Published in: Journal of Mathematics in Industry
Full work available at URL: https://doi.org/10.1186/s13362-020-00075-3
Keywords: optimal control; necessary optimality conditions; neural networks; reinforcement learning; Markov decision process
MSC classifications:
- 60J20 Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.)
- 93E35 Stochastic learning and adaptive control
- 90C40 Markov and semi-Markov decision processes
- 49K15 Optimality conditions for problems involving ordinary differential equations
- 68Q06 Networks and circuits as models of computation; circuit complexity
Cites Work
- Ordinary differential equations. An introduction from the dynamical systems perspective
- Optimal control of ODEs and DAEs
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- \({\mathcal Q}\)-learning
- Deep learning as optimal control problems: models and numerical methods
- Reinforcement Learning Applied to a Human Arm Model
- Handbook of Markov decision processes. Methods and applications