scientific article; zbMATH DE number 6860778
Publication:4636981
zbMath: 1434.68463 ⋮ MaRDI QID: Q4636981
Jan Peters, Gerhard Neumann, Herke van Hoof
Publication date: 17 April 2018
Full work available at URL: http://jmlr.csail.mit.edu/papers/v18/16-142.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Nonparametric estimation (62G05) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Optimal stochastic control (93E20) ⋮ Stochastic learning and adaptive control (93E35) ⋮ Markov and semi-Markov decision processes (90C40)
Related Items (2)
Variational policy search using sparse Gaussian process priors for learning multimodal optimal actions ⋮ Unnamed Item
Cites Work
- Kernel methods in machine learning
- Approximate dynamic programming with a fuzzy parameterization
- Kernel-based reinforcement learning
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Model-based contextual policy search for data-efficient generalization of robot skills
- Bias and Variance Approximation in Value Function Estimates
- Using Expectation-Maximization for Reinforcement Learning
- Algorithms for Reinforcement Learning
- 10.1162/1532443041827907
- Approximate Dynamic Programming
- Unnamed Item