Combinatorial bandits
From MaRDI portal
Publication:439986
DOI: 10.1016/j.jcss.2012.01.001; zbMath: 1262.91052; OpenAlex: W2914156981; Wikidata: Q59538560; Scholia: Q59538560; MaRDI QID: Q439986
Gábor Lugosi, Nicolò Cesa-Bianchi
Publication date: 17 August 2012
Published in: Journal of Computer and System Sciences
Full work available at URL: https://doi.org/10.1016/j.jcss.2012.01.001
Related Items
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits
- Bandit online optimization over the permutahedron
- Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information
- Combining initial segments of lists
- Online learning of network bottlenecks via minimax paths
- Multi-armed bandits with censored consumption of resources
- Variable Selection Via Thompson Sampling
- Online team formation under different synergies
- Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
- Unnamed Item
- Online learning of energy consumption for navigation of electric vehicles
- A combinatorial multi-armed bandit approach to correlation clustering
- Multi-channel transmission scheduling with hopping scheme under uncertain channel states
- Per-Round Knapsack-Constrained Linear Submodular Bandits
- Learning in Combinatorial Optimization: What and How to Explore
- An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem
- Sequential Shortest Path Interdiction with Incomplete Information
- Adaptive policies for perimeter surveillance problems
- Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
- A Combinatorial Metrical Task System Problem Under the Uniform Metric
- Online Learning over a Finite Action Set with Limited Switching
- Learning Unknown Service Rates in Queues: A Multiarmed Bandit Approach
- Asymptotically optimal algorithms for budgeted multiple play bandits
- Nested-Batch-Mode Learning and Stochastic Optimization with An Application to Sequential MultiStage Testing in Materials Science
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Local characteristics, entropy and limit theorems for spanning trees and domino tilings via transfer-impedances
- Efficient algorithms for online decision problems
- Probability on Trees and Networks
- A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries
- Polynomial-Time Approximation Algorithms for the Ising Model
- Adaptive routing with end-to-end feedback
- Robbing the bandit
- How to Get a Perfectly Random Sample from a Generic Markov Chain and Generate a Random Spanning Tree of a Directed Graph
- Learning Theory
- The Nonstochastic Multiarmed Bandit Problem
- 10.1162/1532443041424328
- Learning Permutations with Exponential Weights
- Prediction, Learning, and Games