Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits

From MaRDI portal
Publication:6335008

arXiv2002.07258MaRDI QIDQ6335008

Author name not available (Why is that?)

Publication date: 17 February 2020

Abstract: We consider combinatorial semi-bandits over a set of arms calXsubset0,1d where rewards are uncorrelated across items. For this problem, the algorithm ESCB yields the smallest known regret bound R(T)=calOBig(d(lnm)2(lnT)overDeltaminBig), but it has computational complexity calO(|calX|) which is typically exponential in d, and cannot be used in large dimensions. We propose the first algorithm which is both computationally and statistically efficient for this problem with regret R(T)=calOBig(d(lnm)2(lnT)overDeltaminBig) and computational complexity . Our approach involves carefully designing an approximate version of ESCB with the same regret guarantees, showing that this approximate algorithm can be implemented in time by repeatedly maximizing a linear function over calX subject to a linear budget constraint, and showing how to solve this maximization problems efficiently.




Has companion code repository: https://github.com/dourouc05/CombinatorialBandits.jl








This page was built for publication: Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6335008)