Pages that link to "Item:Q1812929"
From MaRDI portal
The following pages link to Practical issues in temporal difference learning (Q1812929):
Displaying 40 items.
- Approximate dynamic programming via direct search in the space of value function approximations (Q713118) (← links)
- Deep learning of support vector machines with class probability output networks (Q890735) (← links)
- A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game (Q1048261) (← links)
- Learning metric-topological maps for indoor mobile robot navigation (Q1128610) (← links)
- Model-based average reward reinforcement learning (Q1128769) (← links)
- Games, computers, and artificial intelligence (Q1603559) (← links)
- Linear least-squares algorithms for temporal difference learning (Q1911340) (← links)
- Feature-based methods for large scale dynamic programming (Q1911341) (← links)
- Reinforcement learning with replacing eligibility traces (Q1911343) (← links)
- The loss from imperfect value functions in expectation-based and minimax-based tasks (Q1911345) (← links)
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130) (← links)
- Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results (Q1949593) (← links)
- A reinforcement learning approach for dynamic multi-objective optimization (Q2055564) (← links)
- Deep reinforcement learning for inventory control: a roadmap (Q2076812) (← links)
- Learning long-term chess strategies from databases (Q2433173) (← links)
- Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty (Q2450902) (← links)
- Reinforcement learning of non-Markov decision processes (Q2675282) (← links)
- An approximate dynamic programming method for multi-input multi-output nonlinear system (Q2857151) (← links)
- Approximate policy iteration: a survey and some new methods (Q2887629) (← links)
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications (Q2887630) (← links)
- Player co-modelling in a strategy board game: discovering how to play fast (Q3393464) (← links)
- Hyperbolically Discounted Temporal Difference Learning (Q3568377) (← links)
- TD(λ) learning without eligibility traces: a theoretical analysis (Q4421245) (← links)
- Two-agent IDA* (Q4421276) (← links)
- Artificial Intelligence and Soft Computing - ICAISC 2004 (Q4666254) (← links)
- Many-Layered Learning (Q4781930) (← links)
- Bayesian Exploration for Approximate Dynamic Programming (Q4971589) (← links)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis (Q5162625) (← links)
- An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification (Q5380387) (← links)
- Two steps reinforcement learning (Q5450297) (← links)
- Solving dynamic wildlife resource optimization problems using reinforcement learning (Q5697240) (← links)
- Cooperation of categorical and behavioral learning in a practical solution to the abstraction problem (Q5945161) (← links)
- A tutorial survey of reinforcement learning (Q5955768) (← links)
- Programming backgammon using self-teaching neural nets (Q5958206) (← links)
- Computer Go: An AI oriented survey (Q5958709) (← links)
- New Versions of Gradient Temporal-Difference Learning (Q6093230) (← links)
- Optimal liquidation through a limit order book: a neural network and simulation approach (Q6164829) (← links)
- A brief survey on nonlinear control using adaptive dynamic programming under engineering-oriented complexities (Q6170807) (← links)
- Error controlled actor-critic (Q6205028) (← links)
- Eligibility traces and forgetting factor in recursive least-squares-based temporal difference (Q6495643) (← links)