Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards

Adaptive control/observation systems (93C40) ⋮ Estimation and detection in stochastic control theory (93E10) ⋮ Stochastic games, stochastic differential games (91A15) ⋮ Stochastic systems in control theory (general) (93E03) ⋮ Probabilistic games; gambling (91A60)

Related Items (13)

Multiplayer Bandits Without Observing Collision Information ⋮ Distributed cooperative decision making in multi-agent multi-armed bandits ⋮ A perpetual search for talents across overlapping generations: a learning process ⋮ Managing caching strategies for stream reasoning with reinforcement learning ⋮ Optimal strategies for a class of sequential control problems with precedence relations ⋮ Learning in Combinatorial Optimization: What and How to Explore ⋮ Unnamed Item ⋮ Arbitrary side observations in bandit problems ⋮ An online algorithm for the risk-aware restless bandit ⋮ Adaptive policies for perimeter surveillance problems ⋮ Certainty equivalence control with forcing: Revisited ⋮ Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback ⋮ Asymptotically optimal algorithms for budgeted multiple play bandits
