Pages that link to "Item:Q2485935"
From MaRDI portal
The following pages link to Basis function adaptation in temporal difference reinforcement learning (Q2485935):
- Model selection in reinforcement learning (Q415618)
- Approximate dynamic programming via direct search in the space of value function approximations (Q713118)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
- Projected equation methods for approximate solution of large linear systems (Q1012492)
- An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868)
- An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797)
- Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590)
- Reinforcement learning for a biped robot based on a CPG-actor-critic method (Q2383520)
- Restricted gradient-descent algorithm for value-function approximation in reinforcement learning (Q2389624)
- A tutorial on the cross-entropy method (Q2485925)
- Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control (Q2800471)
- Approximate policy iteration: a survey and some new methods (Q2887629)
- Learning Tetris Using the Noisy Cross-Entropy Method (Q3421374)
- An Incremental Fast Policy Search Using a Single Sample Path (Q5045345)
- (Q5214220)
- Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning (Q5219302)
- Approximate dynamic programming via iterated Bellman inequalities (Q5256802)
- Actor-Critic Algorithms with Online Feature Adaptation (Q5270681)
- Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage (Q5882386)