Pages that link to "Item:Q4785631"
From MaRDI portal
The following pages link to The Nonstochastic Multiarmed Bandit Problem (Q4785631):
- Robust control of the multi-armed bandit problem (Q2095215) (← links)
- MedleySolver: online SMT algorithm selection (Q2118336) (← links)
- Adaptive large neighborhood search for mixed integer programming (Q2146445) (← links)
- Dynamic pricing with finite price sets: a non-parametric approach (Q2238754) (← links)
- Filtered Poisson process bandit on a continuum (Q2239901) (← links)
- Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers (Q2251439) (← links)
- BoostingTree: parallel selection of weak learners in boosting, with application to ranking (Q2251442) (← links)
- Mistake bounds on the noise-free multi-armed bandit game (Q2280334) (← links)
- New bounds on the price of bandit feedback for mistake-bounded online multiclass learning (Q2290693) (← links)
- Analysis of Hannan consistent selection for Monte Carlo tree search in simultaneous move games (Q2303656) (← links)
- A bad arm existence checking problem: how to utilize asymmetric problem structure? (Q2303673) (← links)
- On the stability of an adaptive learning dynamics in traffic games (Q2319665) (← links)
- Exponential weight approachability, applications to calibration and regret minimization (Q2342741) (← links)
- Improved second-order bounds for prediction with expert advice (Q2384131) (← links)
- Online calibrated forecasts: memory efficiency versus universality for learning in games (Q2384142) (← links)
- Global Nash convergence of Foster and Young's regret testing (Q2384434) (← links)
- Pure exploration in finitely-armed and continuous-armed bandits (Q2431430) (← links)
- Online linear optimization and adaptive routing (Q2462507) (← links)
- Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems (Q2479159) (← links)
- Multi-armed bandits based on a variant of simulated annealing (Q2520136) (← links)
- Mechanisms with learning for stochastic multi-armed bandit problems (Q2520139) (← links)
- Value functions for depth-limited solving in zero-sum imperfect-information games (Q2680767) (← links)
- Regret minimization in online Bayesian persuasion: handling adversarial receiver's types under full and partial feedback models (Q2680788) (← links)
- Multi-channel transmission scheduling with hopping scheme under uncertain channel states (Q2694160) (← links)
- Truthful mechanisms with implicit payment computation (Q2796397) (← links)
- On the Prior Sensitivity of Thompson Sampling (Q2831392) (← links)
- Online Learning in Markov Decision Processes with Continuous Actions (Q2835638) (← links)
- Close the gaps: a learning-while-doing algorithm for single-product revenue management problems (Q2875601) (← links)
- Achieving Unbounded Resolution in Finite Player Goore Games Using Stochastic Automata, and Its Applications (Q2888572) (← links)
- Playing in stochastic environment: from multi-armed bandits to two-player games (Q2908837) (← links)
- Discount Targeting in Online Social Networks Using Backpressure-Based Learning (Q2917229) (← links)
- Learning where to attend with deep architectures for image tracking (Q2919435) (← links)
- Chasing Ghosts: Competing with Stateful Policies (Q2968152) (← links)
- Agent-based Modeling and Simulation of Competitive Wholesale Electricity Markets (Q2974421) (← links)
- Computational Randomness from Generalized Hardcore Sets (Q3088271) (← links)
- The Irrevocable Multiarmed Bandit Problem (Q3098762) (← links)
- (Q3121140) (← links)
- On Learning Algorithms for Nash Equilibria (Q3162512) (← links)
- No Regret Learning in Oligopolies: Cournot vs. Bertrand (Q3162528) (← links)
- On Solving Finite State Multi-Armed Bandit Problem by Linear Programming (Q3360683) (← links)
- Reinforcement with Fading Memories (Q3387923) (← links)
- Bayesian Incentive-Compatible Bandit Exploration (Q3387959) (← links)
- Incentivizing Exploration with Heterogeneous Value of Money (Q3460803) (← links)
- Following the Perturbed Leader to Gamble at Multi-armed Bandits (Q3520057) (← links)
- A Simple Distribution-Free Approach to the Max k-Armed Bandit Problem (Q3524258) (← links)
- Online Regret Bounds for Markov Decision Processes with Deterministic Transitions (Q3529915) (← links)
- Workspace-Based Connectivity Oracle: An Adaptive Sampling Strategy for PRM Planning (Q3564291) (← links)
- Pure Exploration in Multi-armed Bandits Problems (Q3648740) (← links)
- (Q4558509) (← links)
- Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback (Q4596721) (← links)