scientific article; zbMATH DE number 6982910
From MaRDI portal
Publication:4558474
zbMath: 1471.62441; MaRDI QID: Q4558474
Junya Honda, Michael N. Katehakis, Wesley Cowan
Publication date: 22 November 2018
Full work available at URL: http://jmlr.csail.mit.edu/papers/v18/15-154.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Related Items (4)
- EXPLORATION–EXPLOITATION POLICIES WITH ALMOST SURE, ARBITRARILY SLOW GROWING ASYMPTOTIC REGRET
- Optimal activation of halting multi‐armed bandit models
- Unnamed Item
- Adaptive policies for perimeter surveillance problems
Cites Work
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Asymptotically optimal Bayesian sequential change detection and identification rules
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Asymptotically efficient adaptive allocation rules
- On the Gittins index for multiarmed bandits
- Optimal adaptive policies for sequential allocation problems
- Online Learning of Rested and Restless Bandits
- Multi‐Armed Bandit Allocation Indices
- Optimal Adaptive Policies for Markov Decision Processes
- ASYMPTOTIC BAYES ANALYSIS FOR THE FINITE-HORIZON ONE-ARMED-BANDIT PROBLEM
- 10.1162/1532443041827907
- On large deviations properties of sequential allocation problems
- MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem
- Unnamed Item
- Unnamed Item
- Unnamed Item