scientific article - MaRDI portal

Publication:3093261

zbMath: 1222.68193; MaRDI QID: Q3093261

Pierre Geurts, Damien Ernst, Louis Wehenkel

Publication date: 12 October 2011

Full work available at URL: http://www.jmlr.org/papers/v6/ernst05a.html

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

optimal control; supervised learning; regression trees; ensemble methods; batch mode reinforcement learning; fitted value iteration

Mathematics Subject Classification ID

Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Learning and adaptive systems in artificial intelligence (68T05)

Related Items

Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer
A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
Extreme state aggregation beyond Markov decision processes
Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
Making friends on the fly: cooperating with new teammates
Scalable transfer learning in heterogeneous, dynamic environments
Efficient approximate dynamic programming based on design and analysis of computer experiments for infinite-horizon optimization
Batch mode reinforcement learning based on the synthesis of artificial trajectories
Data-driven switching modeling for MPC using regression trees and random forests
Optimized ensemble value function approximation for dynamic programming
Model selection in reinforcement learning
Tutorial on Amortized Optimization
Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning
Approximated multi-agent fitted Q iteration
Reinforcement learning algorithms with function approximation: recent advances and applications
Evolving interpretable decision trees for reinforcement learning
Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains
Epoch-incremental reinforcement learning algorithms
Towards Min Max Generalization in Reinforcement Learning
Extremely randomized trees
Unnamed Item
Quadratic approximate dynamic programming for input‐affine systems
Learning output reference model tracking for higher-order nonlinear systems with unknown dynamics
Extremely randomized trees
Hessian matrix distribution for Bayesian policy gradient reinforcement learning
Approximate dynamic programming with a fuzzy parameterization
Recovery of simultaneous low rank and two-way sparse coefficient matrices, a nonconvex approach
Fitted Q-iteration by functional networks for control problems
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Regularized feature selection in reinforcement learning
Estimating optimal shared-parameter dynamic regimens with application to a multistage depression clinical trial
Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies
The QLBS Q-Learner goes NuQLear: fitted Q iteration, inverse RL, and option portfolios
A deep reinforcement learning framework for continuous intraday market bidding
Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
Multi-agent reinforcement learning: a selective overview of theories and algorithms
Learning When-to-Treat Policies
Batch policy learning in average reward Markov decision processes