Actor-Critic Algorithms with Online Feature Adaptation
From MaRDI portal
Publication:5270681
DOI: 10.1145/2868723 | zbMath: 1369.90190 | OpenAlex: W2256989395 | MaRDI QID: Q5270681
Vivek S. Borkar, K. J. Prabuchandran, Shalabh Bhatnagar
Publication date: 30 June 2017
Published in: ACM Transactions on Modeling and Computer Simulation
Full work available at URL: https://doi.org/10.1145/2868723
Keywords: Markov decision processes; stochastic approximation; online learning; Grassmann manifold; temporal difference learning; actor-critic algorithms; function approximation; SPSA; policy gradients; feature adaptation; residual gradient scheme
Related Items (1)
Cites Work
- Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
- Stochastic approximation. A dynamical systems viewpoint.
- Natural actor-critic algorithms
- Stochastic approximation methods for constrained and unconstrained systems
- Stochastic approximation with two time scales
- Average cost temporal-difference learning
- Convergence rate of linear two-time-scale stochastic approximation.
- Basis function adaptation in temporal difference reinforcement learning
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- The Geometry of Algorithms with Orthogonality Constraints
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- Simulation-based optimization of Markov reward processes
- 10.1162/1532443041827934
- 10.1162/1532443041827907