Robust control of the multi-armed bandit problem
Publication: 2095215
DOI: 10.1007/s10479-015-1965-7 · zbMath: 1506.90268 · OpenAlex: W3124603229 · MaRDI QID: Q2095215
Aparupa Das Gupta, Felipe Caro
Publication date: 9 November 2022
Published in: Annals of Operations Research
Full work available at URL: https://doi.org/10.1007/s10479-015-1965-7
Keywords: Bellman equation; project selection; index policies; multiarmed bandit; robust Markov decision processes; uncertain transition matrix
Related Items (2)
- Optimal Learning Under Robustness and Time-Consistency
- Computation of weighted sums of rewards for concurrent MDPs
Cites Work
- Robust decomposable Markov decision processes motivated by allocating school budgets
- Four proofs of Gittins' multiarmed bandit theorem
- The multi-armed bandit, with constraints
- Asymptotically efficient adaptive allocation rules
- Arm-acquiring bandits
- Algorithms for evaluating the dynamic allocation index
- Bounded-parameter Markov decision processes
- Lagrangian relaxation and constraint generation for allocation and advanced scheduling
- Optimal adaptive policies for sequential allocation problems
- On the optimality of the Gittins index rule for multi-armed bandits with multiple plays
- A dynamic programming approach to adjustable robust optimization
- Percentile Optimization for Markov Decision Processes with Parameter Uncertainty
- Indexability of bandit problems with response delays
- The Multi-Armed Bandit Problem: Decomposition and Computation
- Funding Criteria for Research, Development, and Exploration Projects
- Markov Decision Processes with Imprecise Transition Probabilities
- Optimal Adaptive Policies for Markov Decision Processes
- Markovian Decision Processes with Uncertain Transition Probabilities
- The Nonstochastic Multiarmed Bandit Problem
- Robust Markov Decision Processes
- Robust Control of Markov Decision Processes with Uncertain Transition Matrices
- Multi-armed bandits under general depreciation and commitment
- Robust Dynamic Programming