Finite state multi-armed bandit problems: Sensitive-discount, average-reward and average-overtaking optimality
Publication:2564701
DOI: 10.1214/aoap/1034968239; zbMath: 0862.90127; OpenAlex: W2037155730; MaRDI QID: Q2564701
Michael N. Katehakis, Uriel G. Rothblum
Publication date: 15 January 1997
Published in: The Annals of Applied Probability
Full work available at URL: https://doi.org/10.1214/aoap/1034968239
Sensitivity, stability, parametric optimization (90C31) ⋮ Dynamic programming (90C39) ⋮ Stopping times; optimal stopping problems; gambling theory (60G40) ⋮ Markov and semi-Markov decision processes (90C40)
Related Items (6)
Optimistic Gittins Indices ⋮ Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue ⋮ Four proofs of Gittins' multiarmed bandit theorem ⋮ Optimal activation of halting multi‐armed bandit models ⋮ MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT ⋮ Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting
Cites Work
- Multi-armed bandits with discount factor near one: The Bernoulli case
- Finite state Markovian decision processes
- Procedures for the evaluation of strategies for resource allocation in a stochastic environment
- The Multi-Armed Bandit Problem: Decomposition and Computation
- Open bandit processes and optimal scheduling of queueing networks
- On the evaluation of suboptimal strategies for families of alternative bandit processes
- Bounds for discounted stochastic scheduling problems
- Discrete Dynamic Programming
- On Finding Optimal Policies in Discrete Dynamic Programming with No Discounting
- An Optimality Condition for Discrete Dynamic Programming with no Discounting
- Discrete Dynamic Programming with a Small Interest Rate
- Discrete Dynamic Programming with Sensitive Discount Optimality Criteria