Pages that link to "Item:Q2859220"
From MaRDI portal
The following pages link to An Efficient Algorithm for Learning with Semi-bandit Feedback (Q2859220):
- Combinatorial bandits (Q439986)
- An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem (Q1785430)
- Importance weighting without importance weights: an efficient algorithm for combinatorial semi-bandits (Q2834482)
- A Survey of Preference-Based Online Learning with Bandit Algorithms (Q2938721)
- (Q4558509)
- Online Learning over a Finite Action Set with Limited Switching (Q4991672)
- An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback (Q5361319)
- (Q5381125)
- (Q5744820)