Pages that link to "Item:Q1604819"

What links here

The following pages link to Technical update: Least-squares temporal difference learning (Q1604819):

Displaying 25 items.

Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762) (← links)
The optimal unbiased value estimator and its relation to LSTD, TD and MC (Q415609) (← links)
Asymptotic analysis of value prediction by well-specified and misspecified models (Q448322) (← links)
A two-level optimization model for elective surgery scheduling with downstream capacity constraints (Q666974) (← links)
Proximal algorithms and temporal difference methods for solving fixed point problems (Q721950) (← links)
Solving factored MDPs using non-homogeneous partitions (Q814475) (← links)
A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning (Q859737) (← links)
Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601) (← links)
Projected equation methods for approximate solution of large linear systems (Q1012492) (← links)
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797) (← links)
Linear least-squares algorithms for temporal difference learning (Q1911340) (← links)
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning (Q2389624) (← links)
Basis function adaptation in temporal difference reinforcement learning (Q2485935) (← links)
An approximate dynamic programming approach to the admission control of elective patients (Q2668708) (← links)
Convergence of the standard RLS method and UDU^T factorisation of covariance matrix for solving the algebraic Riccati equation of the DLQR via heuristic approximate dynamic programming (Q2792939) (← links)
Approximate optimal adaptive control for weakly coupled nonlinear systems: a neuro-inspired approach (Q2829514) (← links)
Approximate policy iteration: a survey and some new methods (Q2887629) (← links)
On Generalized Bellman Equations and Temporal-Difference Learning (Q3305109) (← links)
(Q5168862) (← links)
(Q5168869) (← links)
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning (Q5189863) (← links)
Linear least-squares algorithms for temporal difference learning (Q5477859) (← links)
New Versions of Gradient Temporal-Difference Learning (Q6093230) (← links)
Deep reinforcement trading with predictable returns (Q6098411) (← links)
Eligibility traces and forgetting factor in recursive least-squares-based temporal difference (Q6495643) (← links)