Index-based policies for discounted multi-armed bandits on parallel machines.
Publication:1872472
DOI: 10.1214/AOAP/1019487512; zbMath: 1073.90568; OpenAlex: W2042312241; MaRDI QID: Q1872472
Darren J. Wilkinson, Kevin D. Glazebrook
Publication date: 6 May 2003
Published in: The Annals of Applied Probability
Full work available at URL: https://doi.org/10.1214/aoap/1019487512
parallel machines; Gittins index; multi-armed bandit problem; Average-overtaking optimal; average-reward optimal; suboptimality bound
Minimax problems in mathematical programming (90C47); Stochastic scheduling theory in operations research (90B36); Markov and semi-Markov decision processes (90C40)
Cites Work
- Unnamed Item
- Almost optimal policies for stochastic systems which almost satisfy conservation laws
- A Characterization of Waiting Time Performance Realizable by Single-Server Queues
- Multiclass Queueing Systems: Polymatroidal Structure and Optimal Scheduling Control
- Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems
- Discrete Dynamic Programming
- On Finding Optimal Policies in Discrete Dynamic Programming with No Discounting
- An Optimality Condition for Discrete Dynamic Programming with no Discounting