Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards
DOI: 10.1109/TAC.1987.1104491, zbMath: 0632.93067, MaRDI QID: Q3770415
V. Anantharam, Jean Walrand, Pravin P. Varaiya
Publication date: 1987
Published in: IEEE Transactions on Automatic Control
resource allocation problems; regret function; multiple plays; denseness condition; Multiarmed bandit problems
Adaptive control/observation systems (93C40) Estimation and detection in stochastic control theory (93E10) Stochastic games, stochastic differential games (91A15) Stochastic systems in control theory (general) (93E03) Probabilistic games; gambling (91A60)