scientific article
Publication:3093197
zbMath: 1222.68099, MaRDI QID: Q3093197
Shie Mannor, John N. Tsitsiklis
Publication date: 12 October 2011
Full work available at URL: http://www.jmlr.org/papers/v5/mannor04b.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Learning and adaptive systems in artificial intelligence (68T05) ⋮ Markov and semi-Markov decision processes (90C40)
Related Items (23)
Best Arm Identification for Contaminated Bandits ⋮ Approximation algorithms for stochastic combinatorial optimization problems ⋮ Sequential estimation of quantiles with applications to A/B testing and best-arm identification ⋮ Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model ⋮ Unnamed Item ⋮ An instance-based algorithm for deciding the bias of a coin ⋮ A perpetual search for talents across overlapping generations: a learning process ⋮ Pure exploration in finitely-armed and continuous-armed bandits ⋮ Learning the distribution with largest mean: two bandit frameworks ⋮ The \(K\)-armed dueling bandits problem ⋮ Online Regret Bounds for Markov Decision Processes with Deterministic Transitions ⋮ Amplification and Derandomization without Slowdown ⋮ Tractable Sampling Strategies for Ordinal Optimization ⋮ Near-optimal PAC bounds for discounted MDPs ⋮ UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem ⋮ Simple Bayesian Algorithms for Best-Arm Identification ⋮ Online regret bounds for Markov decision processes with deterministic transitions ⋮ Explore First, Exploit Next: The True Shape of Regret in Bandit Problems ⋮ Bayesian Incentive-Compatible Bandit Exploration ⋮ Pure Exploration in Multi-armed Bandits Problems ⋮ Multi-armed bandits with episode context ⋮ Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models ⋮ Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning