An Efficient Algorithm for Learning with Semi-bandit Feedback
Publication:2859220
DOI: 10.1007/978-3-642-40935-6_17 ⋮ zbMath: 1406.68099 ⋮ arXiv: 1305.2732 ⋮ OpenAlex: W1561097743 ⋮ MaRDI QID: Q2859220
Publication date: 6 November 2013
Published in: Lecture Notes in Computer Science
Full work available at URL: https://arxiv.org/abs/1305.2732
Learning and adaptive systems in artificial intelligence (68T05) ⋮ Combinatorial optimization (90C27) ⋮ General considerations in statistical decision theory (62C05)
Related Items (4)
Unnamed Item ⋮ An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem ⋮ Unnamed Item ⋮ Online Learning over a Finite Action Set with Limited Switching