Optimal adaptive policies for sequential allocation problems
Publication:1922542
DOI: 10.1006/aama.1996.0007
zbMath: 0854.60032
OpenAlex: W1983962754
Wikidata: Q55883744
Scholia: Q55883744
MaRDI QID: Q1922542
Michael N. Katehakis, Apostolos N. Burnetas
Publication date: 22 January 1997
Published in: Advances in Applied Mathematics
Full work available at URL: https://semanticscholar.org/paper/2fa3f78bd544c4bdb7986b5dd9feda47492b1e34
Related Items (21)
A non-parametric solution to the multi-armed bandit problem with covariates ⋮ Response-adaptive designs for clinical trials: simultaneous learning from multiple patients ⋮ EXPLORATION–EXPLOITATION POLICIES WITH ALMOST SURE, ARBITRARILY SLOW GROWING ASYMPTOTIC REGRET ⋮ Kullback-Leibler upper confidence bounds for optimal sequential allocation ⋮ The multi-armed bandit problem: an efficient nonparametric solution ⋮ Adaptive aggregation for reinforcement learning in average reward Markov decision processes ⋮ On bidding for a fixed number of items in a sequence of auctions ⋮ Robustness of stochastic bandit policies ⋮ A perpetual search for talents across overlapping generations: a learning process ⋮ An asymptotically optimal policy for finite support models in the multiarmed bandit problem ⋮ MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT ⋮ ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT ⋮ Learning the distribution with largest mean: two bandit frameworks ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Infinite Arms Bandit: Optimality via Confidence Bounds ⋮ Adaptive policies for perimeter surveillance problems ⋮ Explore First, Exploit Next: The True Shape of Regret in Bandit Problems ⋮ Asymptotically optimal algorithms for budgeted multiple play bandits ⋮ Robust control of the multi-armed bandit problem ⋮ Adaptive Policies for Sequential Sampling under Incomplete Information and a Cost Constraint