scientific article; zbMATH DE number 7626733
Publication:5053221
Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan
Publication date: 6 December 2022
Full work available at URL: https://arxiv.org/abs/1810.01187
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Cites Work
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Linear Thompson sampling revisited
- Multi-objective multi-armed bandit with lexicographically ordered and satisficing objectives
- On Upper-Confidence Bound Policies for Switching Bandit Problems
- Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis
- Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards
- A Tutorial on Thompson Sampling
- Near-Optimal Regret Bounds for Thompson Sampling
- The Nonstochastic Multiarmed Bandit Problem
- MNL-Bandit: A Dynamic Learning Approach to Assortment Selection
- Learning to Optimize via Posterior Sampling
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Elements of Information Theory