scientific article
Publication:3093261
zbMath 1222.68193, MaRDI QID Q3093261
Pierre Geurts, Damien Ernst, Louis Wehenkel
Publication date: 12 October 2011
Full work available at URL: http://www.jmlr.org/papers/v6/ernst05a.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
optimal control, supervised learning, regression trees, ensemble methods, batch mode reinforcement learning, fitted value iteration
Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Learning and adaptive systems in artificial intelligence (68T05)
Related Items
Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer
A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
Extreme state aggregation beyond Markov decision processes
Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
Making friends on the fly: cooperating with new teammates
Scalable transfer learning in heterogeneous, dynamic environments
Efficient approximate dynamic programming based on design and analysis of computer experiments for infinite-horizon optimization
Batch mode reinforcement learning based on the synthesis of artificial trajectories
Data-driven switching modeling for MPC using regression trees and random forests
Optimized ensemble value function approximation for dynamic programming
Model selection in reinforcement learning
Tutorial on Amortized Optimization
Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning
Approximated multi-agent fitted Q iteration
Reinforcement learning algorithms with function approximation: recent advances and applications
Evolving interpretable decision trees for reinforcement learning
Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains
Epoch-incremental reinforcement learning algorithms
Towards Min Max Generalization in Reinforcement Learning
Extremely randomized trees
Unnamed Item
Quadratic approximate dynamic programming for input‐affine systems
Learning output reference model tracking for higher-order nonlinear systems with unknown dynamics
Extremely randomized trees
Hessian matrix distribution for Bayesian policy gradient reinforcement learning
Approximate dynamic programming with a fuzzy parameterization
Recovery of simultaneous low rank and two-way sparse coefficient matrices, a nonconvex approach
Fitted Q-iteration by functional networks for control problems
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Regularized feature selection in reinforcement learning
Estimating optimal shared-parameter dynamic regimens with application to a multistage depression clinical trial
Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies
The QLBS Q-Learner goes NuQLear: fitted Q iteration, inverse RL, and option portfolios
A deep reinforcement learning framework for continuous intraday market bidding
Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
Multi-agent reinforcement learning: a selective overview of theories and algorithms
Learning When-to-Treat Policies
Batch policy learning in average reward Markov decision processes