Online Learning over a Finite Action Set with Limited Switching
From MaRDI portal
Publication:4991672
DOI10.1287/moor.2020.1052OpenAlexW3085291675MaRDI QIDQ4991672
Kunal Talwar, Jason M. Altschuler
Publication date: 3 June 2021
Published in: Mathematics of Operations Research (Search for Journal in Brave)
Full work available at URL: https://arxiv.org/abs/1803.01548
online learningswitching costsminimax optimalityadversarial multiarmed banditshigh-probability algorithmsonline combinatorial optimizationprediction from expertsswitching budgets
Learning and adaptive systems in artificial intelligence (68T05) Reasoning under uncertainty in the context of artificial intelligence (68T37)
Uses Software
Cites Work
- Unnamed Item
- Unnamed Item
- Combinatorial bandits
- The weighted majority algorithm
- A decision-theoretic generalization of on-line learning and an application to boosting
- Predicting nearly as well as the best pruning of a planar decision graph.
- Online linear optimization and adaptive routing
- Efficient algorithms for online decision problems
- Concentration Inequalities
- An Efficient Algorithm for Learning with Semi-bandit Feedback
- Random-Walk Perturbations for Online Combinatorial Optimization
- Near-Optimal Rates for Limited-Delay Universal Lossy Source Coding
- Online Learning and Online Convex Optimization
- Online Markov Decision Processes
- Markov Decision Processes with Arbitrary Reward Processes
- Minimizing Regret With Label Efficient Prediction
- Dynamic huffman coding
- Self-adjusting binary search trees
- How to use expert advice
- The Nonstochastic Multiarmed Bandit Problem
- 10.1162/1532443041424328
- Regret in Online Combinatorial Optimization
- Bandits with switching costs
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Prediction, Learning, and Games