The multi-armed bandit problem: an efficient nonparametric solution
From MaRDI portal
Publication:2176624
DOI: 10.1214/19-AOS1809; zbMath: 1442.62180; arXiv: 1703.08285; OpenAlex: W3007054292; MaRDI QID: Q2176624
Publication date: 5 May 2020
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/1703.08285
Nonparametric tolerance and confidence regions (62G15) Sequential statistical design (62L05) Optimal stopping in statistics (62L15) Compound decision problems in statistical decision theory (62C25)
Related Items (2)
A non-parametric solution to the multi-armed bandit problem with covariates ⋮ Infinite Arms Bandit: Optimality via Confidence Bounds
Cites Work
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Nonparametric bandit methods
- Asymptotically efficient adaptive allocation rules
- Adaptive treatment allocation and the multi-armed bandit problem
- Optimal learning and experimentation in bandit problems.
- Optimal adaptive policies for sequential allocation problems
- Asymptotically efficient adaptive allocation schemes for controlled Markov chains: finite parameter space
- A new approach to the design of reinforcement schemes for learning automata
- Optimal stopping and dynamic allocation
- Asymptotically Efficient Adaptive Choice of Control Laws in Controlled Markov Chains
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- Some Remarks on the Two-Armed Bandit
- A Bernoulli Two-armed Bandit
- Finite-time analysis of the multiarmed bandit problem