Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments
Publication:924170
DOI: 10.1016/j.tcs.2008.02.024, zbMath: 1145.68026, OpenAlex: W1994989479, MaRDI QID: Q924170
Publication date: 28 May 2008
Published in: Theoretical Computer Science
Full work available at URL: https://doi.org/10.1016/j.tcs.2008.02.024
Keywords: multi-armed bandit, partial observations, countable decision set, nonstochastic bandit, reactive environments
Computational learning theory (68Q32)
Learning and adaptive systems in artificial intelligence (68T05)
Rationality and learning in game theory (91A26)
Multistage and repeated games (91A20)
Probabilistic games; gambling (91A60)
Cites Work
- The weighted majority algorithm
- A decision-theoretic generalization of on-line learning and an application to boosting
- Anytime algorithms for multi-armed bandit problems
- Complexity-based induction systems: Comparisons and convergence theorems
- Learning Theory
- Learning Theory
- The Nonstochastic Multiarmed Bandit Problem
- Regret Minimization Under Partial Monitoring
- Algorithmic Learning Theory
- Algorithmic Learning Theory
- Learning Theory
- Stochastic Algorithms: Foundations and Applications