An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions
Publication:5380403
DOI: 10.1162/NECO_a_00808; zbMath: 1472.68149; OpenAlex: W2225522132; Wikidata: Q47600318; Scholia: Q47600318; MaRDI QID: Q5380403
Kohei Hatano, Yao Ma, Tingting Zhao, Masashi Sugiyama
Publication date: 4 June 2019
Published in: Neural Computation
Full work available at URL: https://doi.org/10.1162/neco_a_00808
Learning and adaptive systems in artificial intelligence (68T05)
Markov and semi-Markov decision processes (90C40)
Online algorithms; streaming algorithms (68W27)
Related Items (1)
An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions
Cites Work
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Efficient algorithms for online decision problems
- Online Markov Decision Processes Under Bandit Feedback
- Online Markov Decision Processes
- Markov Decision Processes with Arbitrary Reward Processes
- Logarithmic Regret Algorithms for Online Convex Optimization
- An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions