Following the Perturbed Leader to Gamble at Multi-armed Bandits
From MaRDI portal
Publication:3520057
DOI: 10.1007/978-3-540-75225-7_16
zbMath: 1142.68398
OpenAlex: W1568674531
MaRDI QID: Q3520057
Publication date: 19 August 2008
Published in: Lecture Notes in Computer Science
Full work available at URL: https://doi.org/10.1007/978-3-540-75225-7_16
Computational learning theory (68Q32)
Stopping times; optimal stopping problems; gambling theory (60G40)
Cites Work
- The weighted majority algorithm
- A decision-theoretic generalization of on-line learning and an application to boosting
- Adaptive routing with end-to-end feedback
- Robbing the bandit
- Learning Theory
- The Nonstochastic Multiarmed Bandit Problem
- Learning Theory and Kernel Machines
- Algorithmic Learning Theory
- Prediction, Learning, and Games
- Learning Theory
- Some aspects of the sequential design of experiments