Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits
From MaRDI portal
Publication:6335008
arXiv2002.07258MaRDI QIDQ6335008
Author name not available (Why is that?)
Publication date: 17 February 2020
Abstract: We consider combinatorial semi-bandits over a set of arms where rewards are uncorrelated across items. For this problem, the algorithm ESCB yields the smallest known regret bound , but it has computational complexity which is typically exponential in , and cannot be used in large dimensions. We propose the first algorithm which is both computationally and statistically efficient for this problem with regret and computational complexity . Our approach involves carefully designing an approximate version of ESCB with the same regret guarantees, showing that this approximate algorithm can be implemented in time by repeatedly maximizing a linear function over subject to a linear budget constraint, and showing how to solve this maximization problems efficiently.
Has companion code repository: https://github.com/dourouc05/CombinatorialBandits.jl
This page was built for publication: Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6335008)