scientific article
From MaRDI portal
Publication:2810758
zbMath: 1360.62433, arXiv: 1407.4443, MaRDI QID: Q2810758
Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
Publication date: 6 June 2016
Full work available at URL: https://arxiv.org/abs/1407.4443
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
multi-armed bandit, sequential testing, pure exploration, best-arm identification, information-theoretic divergences
Analysis of algorithms and problem complexity (68Q25) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Image processing (compression, reconstruction, etc.) in information and communication theory (94A08) ⋮ Sequential statistical analysis (62L10) ⋮ Probabilistic games; gambling (91A60)
Related Items (27)
Best Arm Identification for Contaminated Bandits ⋮ Approximation algorithms for stochastic combinatorial optimization problems ⋮ Sequential estimation of quantiles with applications to A/B testing and best-arm identification ⋮ Robust Learning of Consumer Preferences ⋮ An index-based deterministic convergent optimal algorithm for constrained multi-armed bandit problems ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Good arm identification via bandit feedback ⋮ Treatment recommendation with distributional targets ⋮ Learning the distribution with largest mean: two bandit frameworks ⋮ A unified framework for stochastic optimization ⋮ Fano's inequality for random variables ⋮ Simple Bayesian Algorithms for Best-Arm Identification ⋮ Unnamed Item ⋮ Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization ⋮ Active ranking from pairwise comparisons and when parametric assumptions do not help ⋮ Time-uniform, nonparametric, nonasymptotic confidence sequences ⋮ Sequential controlled sensing for composite multihypothesis testing ⋮ Explore First, Exploit Next: The True Shape of Regret in Bandit Problems ⋮ A bad arm existence checking problem: how to utilize asymmetric problem structure? ⋮ Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback ⋮ A PAC algorithm in relative precision for bandit problem with costly sampling ⋮ Choosing the best arm with guaranteed confidence ⋮ The pure exploration problem with general reward functions depending on full distributions ⋮ On the Bias, Risk, and Consistency of Sample Means in Multi-armed Bandits ⋮ Satisficing in Time-Sensitive Bandit Learning