Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation
Publication:4999359
DOI: 10.1137/20M1311971, zbMath: 1483.68294, arXiv: 1907.12530, OpenAlex: W3132503596, MaRDI QID: Q4999359
Siva Theja Maguluri, Justin Romberg, Thinh T. Doan
Publication date: 6 July 2021
Published in: SIAM Journal on Mathematics of Data Science
Full work available at URL: https://arxiv.org/abs/1907.12530
Analysis of algorithms (68W40); Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40); Distributed algorithms (68W15); Agent technology and artificial intelligence (68T42)
Related Items
- Multi-agent natural actor-critic reinforcement learning algorithms
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation
Cites Work
- Average cost temporal-difference learning
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Least squares policy evaluation algorithms with linear function approximation
- Linear least-squares algorithms for temporal difference learning
- Distributed Policy Evaluation Under Multiple Behavior Strategies
- Algorithms for Reinforcement Learning
- Matrix Analysis
- Markov Chains
- An analysis of temporal-difference learning with function approximation
- \(\mathcal{QD}\)-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
- A Concentration Bound for Stochastic Approximation via Alekseev’s Formula
- Cooperative Control of Mobile Sensor Networks: Adaptive Gradient Climbing in a Distributed Environment
- Distributed Reinforcement Learning via Gossip