ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT
Publication: 5358116
DOI: 10.1017/S026996481600036X
zbMath: 1373.62040
arXiv: 1509.02857
OpenAlex: W2963946176
MaRDI QID: Q5358116
Michael N. Katehakis, Odysseas Kanavetas, Apostolos N. Burnetas
Publication date: 19 September 2017
Published in: Probability in the Engineering and Informational Sciences
Full work available at URL: https://arxiv.org/abs/1509.02857
Mathematics Subject Classification:
- Decision theory (91B06)
- Stochastic programming (90C15)
- Compound decision problems in statistical decision theory (62C25)
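As a rough illustration of the setting named in the title, the sketch below pairs a standard UCB index with a running cost budget: arms are ruled out when their estimated cost would exceed the budget accumulated so far. This is a hypothetical construction, not the policy from the paper; the arm model, the feasibility check, the per-round budget parameter, and the cheapest-arm fallback are all illustrative assumptions.

```python
import math
import random

def ucb_with_cost_constraint(arms, horizon, budget_per_round):
    """Pick arms by a UCB index while keeping cumulative cost below
    budget_per_round * t.  `arms` is a list of
    (reward_sampler, cost_sampler) zero-argument callables."""
    k = len(arms)
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    total_cost = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: play every arm once.
            choice = t - 1
        else:
            budget = budget_per_round * t
            choice, best_index = None, -math.inf
            for i in range(k):
                mean_reward = reward_sum[i] / pulls[i]
                mean_cost = cost_sum[i] / pulls[i]
                # Rule out arms whose estimated cost would break the budget.
                if total_cost + mean_cost > budget:
                    continue
                index = mean_reward + math.sqrt(2.0 * math.log(t) / pulls[i])
                if index > best_index:
                    choice, best_index = i, index
            if choice is None:
                # No arm looks feasible: fall back to the cheapest one.
                choice = min(range(k), key=lambda i: cost_sum[i] / pulls[i])
        reward = arms[choice][0]()
        cost = arms[choice][1]()
        pulls[choice] += 1
        reward_sum[choice] += reward
        cost_sum[choice] += cost
        total_cost += cost

    return sum(reward_sum), total_cost, pulls

if __name__ == "__main__":
    random.seed(0)
    # Two Bernoulli-reward arms with fixed costs (illustrative values).
    arms = [
        (lambda: float(random.random() < 0.7), lambda: 0.5),  # strong but costly
        (lambda: float(random.random() < 0.4), lambda: 0.2),  # weak but cheap
    ]
    print(ucb_with_cost_constraint(arms, horizon=2000, budget_per_round=0.3))
```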
Cites Work
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Asymptotically optimal Bayesian sequential change detection and identification rules
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem
- Applications of mathematics and informatics in military science. Based on a conference, Hellenic Army Academy, Greece, April 2011
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Asymptotically efficient adaptive allocation rules
- Optimal adaptive policies for sequential allocation problems
- Close the Gaps: A Learning-While-Doing Algorithm for Single-Product Revenue Management Problems
- Multi-Armed Bandit Allocation Indices
- The Multi-Armed Bandit Problem: Decomposition and Computation
- Dynamic allocation policies for the finite horizon one armed bandit problem
- Optimal Adaptive Policies for Markov Decision Processes
- Bandits with Knapsacks
- Asymptotic Bayes analysis for the finite-horizon one-armed-bandit problem
- 10.1162/1532443041827907
- On large deviations properties of sequential allocation problems
- Multi-armed bandits under general depreciation and commitment
- Finite-time analysis of the multiarmed bandit problem