scientific article; zbMATH DE number 7370545
From MaRDI portal
Publication:4998901
Julian Zimmert, Yevgeny Seldin
Publication date: 9 July 2021
Full work available at URL: https://arxiv.org/abs/1807.07623
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
online learning ⋮ Tsallis entropy ⋮ stochastic ⋮ multi-armed bandits ⋮ bandits ⋮ adversarial ⋮ best of both worlds ⋮ I.I.D. ⋮ online mirror descent
Related Items (3)
Online team formation under different synergies ⋮ Relaxing the i.i.d. assumption: adaptively minimax optimal regret via root-entropic regularization ⋮ Unnamed Item
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- A generalized online mirror descent with applications to classification and regression
- Asymptotically efficient adaptive allocation rules
- Possible generalization of Boltzmann-Gibbs statistics
- Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis
- Online Learning and Online Convex Optimization
- The Nonstochastic Multiarmed Bandit Problem
- Stochastic bandits robust to adversarial corruptions
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Prediction, Learning, and Games
- Elements of Information Theory
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem