Dismemberment and design for controlling the replication variance of regret for the multi-armed bandit
Publication:2081727
DOI: 10.1007/s42519-022-00280-w; OpenAlex: W4289341771; MaRDI QID: Q2081727
Timothy J. Keaton, Arman Sabbaghi
Publication date: 30 September 2022
Published in: Journal of Statistical Theory and Practice
Full work available at URL: https://doi.org/10.1007/s42519-022-00280-w
Mathematical programming (90Cxx); Mathematical economics (91Bxx); Sequential statistical methods (62Lxx)
Cites Work
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Best subset, forward stepwise or Lasso? Analysis and recommendations based on extensive comparisons
- Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis
- Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring
- Near-Optimal Regret Bounds for Thompson Sampling
- The Nonstochastic Multiarmed Bandit Problem
- Simple Bayesian Algorithms for Best-Arm Identification
- Finite-time analysis of the multiarmed bandit problem