Pages that link to "Item:Q1604819"
From MaRDI portal
The following pages link to Technical update: Least-squares temporal difference learning (Q1604819):
Displaying 25 items.
- Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762) (← links)
- The optimal unbiased value estimator and its relation to LSTD, TD and MC (Q415609) (← links)
- Asymptotic analysis of value prediction by well-specified and misspecified models (Q448322) (← links)
- A two-level optimization model for elective surgery scheduling with downstream capacity constraints (Q666974) (← links)
- Proximal algorithms and temporal difference methods for solving fixed point problems (Q721950) (← links)
- Solving factored MDPs using non-homogeneous partitions (Q814475) (← links)
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning (Q859737) (← links)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601) (← links)
- Projected equation methods for approximate solution of large linear systems (Q1012492) (← links)
- An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797) (← links)
- Linear least-squares algorithms for temporal difference learning (Q1911340) (← links)
- Restricted gradient-descent algorithm for value-function approximation in reinforcement learning (Q2389624) (← links)
- Basis function adaptation in temporal difference reinforcement learning (Q2485935) (← links)
- An approximate dynamic programming approach to the admission control of elective patients (Q2668708) (← links)
- Convergence of the standard RLS method and UDU^T factorisation of covariance matrix for solving the algebraic Riccati equation of the DLQR via heuristic approximate dynamic programming (Q2792939) (← links)
- Approximate optimal adaptive control for weakly coupled nonlinear systems: a neuro-inspired approach (Q2829514) (← links)
- Approximate policy iteration: a survey and some new methods (Q2887629) (← links)
- On Generalized Bellman Equations and Temporal-Difference Learning (Q3305109) (← links)
- (Q5168862) (← links)
- (Q5168869) (← links)
- Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning (Q5189863) (← links)
- Linear least-squares algorithms for temporal difference learning (Q5477859) (← links)
- New Versions of Gradient Temporal-Difference Learning (Q6093230) (← links)
- Deep reinforcement trading with predictable returns (Q6098411) (← links)
- Eligibility traces and forgetting factor in recursive least-squares-based temporal difference (Q6495643) (← links)