Pages that link to "Item:Q2485935"

The following pages link to Basis function adaptation in temporal difference reinforcement learning (Q2485935):

Displaying 19 items.

Model selection in reinforcement learning (Q415618)
Approximate dynamic programming via direct search in the space of value function approximations (Q713118)
Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
Projected equation methods for approximate solution of large linear systems (Q1012492)
An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868)
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797)
Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590)
Reinforcement learning for a biped robot based on a CPG-actor-critic method (Q2383520)
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning (Q2389624)
A tutorial on the cross-entropy method (Q2485925)
Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control (Q2800471)
Approximate policy iteration: a survey and some new methods (Q2887629)
Learning Tetris Using the Noisy Cross-Entropy Method (Q3421374)
An Incremental Fast Policy Search Using a Single Sample Path (Q5045345)
(Q5214220)
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning (Q5219302)
Approximate dynamic programming via iterated Bellman inequalities (Q5256802)
Actor-Critic Algorithms with Online Feature Adaptation (Q5270681)
Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage (Q5882386)