Pages that link to "Item:Q1870310"
From MaRDI portal
The following pages link to Least squares policy evaluation algorithms with linear function approximation (Q1870310):
- Potential-based least-squares policy iteration for a parameterized feedback control system (Q289143)
- Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762)
- Temporal difference-based policy iteration for optimal control of stochastic systems (Q467477)
- Proximal algorithms and temporal difference methods for solving fixed point problems (Q721950)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
- A formal framework and extensions for function approximation in learning classifier systems (Q1009226)
- Projected equation methods for approximate solution of large linear systems (Q1012492)
- Technical update: Least-squares temporal difference learning (Q1604819)
- An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797)
- Real-time reinforcement learning by sequential actor-critics and experience replay (Q1784532)
- Hybrid least-squares algorithms for approximate policy evaluation (Q1959511)
- Transmission scheduling for multi-process multi-sensor remote estimation via approximate dynamic programming (Q2063834)
- Kernel dynamic policy programming: applicable reinforcement learning to robot systems with high dimensional states (Q2292214)
- Dynamic modeling and control of supply chain systems: A review (Q2483501)
- A note on linear function approximation using random projections (Q2519761)
- A concentration bound for \(\operatorname{LSPE}( \lambda )\) (Q2677709)
- Approximate policy iteration: a survey and some new methods (Q2887629)
- Variance Regularization in Sequential Bayesian Optimization (Q3387910)
- 10.1162/1532443041827907 (Q4826001)
- Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation (Q4999359)
- Allocating resources via price management systems: a dynamic programming-based approach (Q5018825)
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning (Q5060503)
- (Q5168862)
- (Q5405225)
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes (Q5898263)
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes (Q5920615)
- Least Squares Policy Iteration with Instrumental Variables vs. Direct Policy Search: Comparison Against Optimal Benchmarks Using Energy Storage (Q6247846)