On the Gittins index for multiarmed bandits

From MaRDI portal

Publication:1203758

DOI: 10.1214/aoap/1177005588; zbMath: 0763.60021; OpenAlex: W1996859119; Wikidata: Q55920221; Scholia: Q55920221; MaRDI QID: Q1203758

Richard R. Weber

Publication date: 22 February 1993

Published in: The Annals of Applied Probability

Full work available at URL: https://doi.org/10.1214/aoap/1177005588

zbMATH Keywords

sequential methods; Gittins index policy; multiarmed bandit problem

Mathematics Subject Classification ID

Deterministic scheduling theory in operations research (90B35)
Stopping times; optimal stopping problems; gambling theory (60G40)
Markov and semi-Markov decision processes (90C40)
Sequential statistical design (62L05)

Related Items (30)

Gambling Under Unknown Probabilities as Proxy for Real World Decisions Under Uncertainty
Multi-armed bandit problem revisited
Open Bandit Processes with Uncountable States and Time-Backward Effects
Optimistic Gittins Indices
Multi-armed bandit processes with optimal selection of the operating times
On Gittins' index theorem in continuous time
Four proofs of Gittins' multiarmed bandit theorem
Kullback-Leibler upper confidence bounds for optimal sequential allocation
The multi-armed bandit, with constraints
The achievable region method in the optimal control of queueing systems; formulations, bounds and policies
Optimal activation of halting multi‐armed bandit models
Index policy for multiarmed bandit problem with dynamic risk measures
MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
Optimal Dynamic Information Acquisition
Dynamic priority allocation via restless bandit marginal productivity indices
Information-gain computation in the \textsc{Fifth} system
Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
Reading policies for joins: an asymptotic analysis
Stopped decision processes in conjunction with general utility
On the Gittins index in the M/G/1 queue
Unnamed Item
Independently Expiring Multiarmed Bandits
Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
Efficiency in lung transplant allocation strategies
Gittins' theorem under uncertainty
Technical Note—A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Multi-armed bandits in discrete and continuous time
Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
