Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards
From MaRDI portal
Publication:2006767
DOI: 10.1016/j.spl.2020.108818; zbMath: 1456.62064; arXiv: 1902.00819; OpenAlex: W3025710735; MaRDI QID: Q2006767
Publication date: 12 October 2020
Published in: Statistics & Probability Letters
Full work available at URL: https://arxiv.org/abs/1902.00819
Nonparametric regression and quantile regression (62G08); Sequential statistical design (62L05); Sequential estimation (62L12)
Cites Work
- The multi-armed bandit problem with covariates
- Asymptotically efficient adaptive allocation rules
- One-armed bandit problems with covariates
- Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
- On sequential decision problems with delayed observations
- A One-Armed Bandit Problem with a Concomitant Variable
- A Tutorial on Thompson Sampling
- Bandit Algorithms
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Prediction, Learning, and Games
- Sequential Analysis with Delayed Observations
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem
- Randomized allocation with arm elimination in a bandit problem with covariates