Two-armed bandit problem and batch version of the mirror descent algorithm
Publication:2081125
DOI: 10.1134/S0005117922080100, zbMath: 1498.93699, MaRDI QID: Q2081125
Alex V. Kolnogorov, D. N. Shiyan, Alexander Nazin
Publication date: 12 October 2022
Published in: Automation and Remote Control
Applications of game theory (91A80); Adaptive control/observation systems (93C40); Stochastic systems in control theory (general) (93E03)
Cites Work
- Asymptotically efficient adaptive allocation rules
- On Bayesian index policies for sequential resource allocation
- Gaussian two-armed bandit and optimization of batch data processing
- Gaussian two-armed bandit: limiting description
- On the efficiency of a randomized mirror descent algorithm in online optimization problems
- An Asymptotic Minimax Theorem for the Two Armed Bandit Problem
- Sequential medical trials
- 10.1162/153244303321897663
- Bandit Algorithms
- Prediction, Learning, and Games
- Some Remarks on the Two-Armed Bandit
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item