Pages that link to "Item:Q5959973"

From MaRDI portal

The following pages link to Finite-time analysis of the multiarmed bandit problem (Q5959973):

Displaying 50 items.

Bayesian policy reuse (Q1689554)
A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing (Q1690964)
A comparison of Monte Carlo tree search and rolling horizon optimization for large-scale dynamic resource allocation problems (Q1694949)
An optimal bidimensional multi-armed bandit auction for multi-unit procurement (Q1714944)
A Monte Carlo tree search approach to finding efficient patrolling schemes on graphs (Q1735189)
Markov decision processes with sequential sensor measurements (Q1737870)
Anytime discovery of a diverse set of patterns with Monte Carlo tree search (Q1741383)
On Bayesian index policies for sequential resource allocation (Q1750289)
A methodology for determining an effective subset of heuristics in selection hyper-heuristics (Q1753518)
Comparison of Kriging-based algorithms for simulation optimization with heterogeneous noise (Q1753578)
A hybrid breakout local search and reinforcement learning approach to the vertex separator problem (Q1753627)
Reinforcement learning with immediate rewards and linear hypotheses (Q1762980)
An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem (Q1785430)
On the probability of correct selection in ordinal comparison over dynamic networks (Q1935307)
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130)
Regret bounds for sleeping experts and bandits (Q1959599)
Batch repair actions for automated troubleshooting (Q1989400)
Latest stored information based adaptive selection strategy for multiobjective evolutionary algorithm (Q1992994)
Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards (Q2006767)
Efficient crowdsourcing of unknown experts using bounded multi-armed bandits (Q2014933)
An online algorithm for the risk-aware restless bandit (Q2029383)
Rollout sampling approximate policy iteration (Q2036256)
Neural precedence recommender (Q2055885)
Enhancing gene expression programming based on space partition and jump for symbolic regression (Q2056306)
A revised approach for risk-averse multi-armed bandits under CVaR criterion (Q2060576)
Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit (Q2074307)
Nonparametric Bayesian multiarmed bandits for single-cell experiment design (Q2078782)
Two-armed bandit problem and batch version of the mirror descent algorithm (Q2081125)
Dismemberment and design for controlling the replication variance of regret for the multi-armed bandit (Q2081727)
Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm (Q2091834)
Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
Fairness in learning-based sequential decision algorithms: a survey (Q2094049)
Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning (Q2094051)
The pure exploration problem with general reward functions depending on full distributions (Q2102381)
Exploring search space trees using an adapted version of Monte Carlo tree search for combinatorial optimization problems (Q2108169)
Online machine learning algorithms to optimize performances of complex wireless communication systems (Q2130281)
PAC-Bayesian lifelong learning for multi-armed bandits (Q2134066)
Adaptive large neighborhood search for mixed integer programming (Q2146445)
Learning to steer nonlinear interior-point methods (Q2175369)
The multi-armed bandit problem: an efficient nonparametric solution (Q2176624)
A reliability-aware multi-armed bandit approach to learn and select users in demand response (Q2207171)
A pricing problem with unknown arrival rate and price sensitivity (Q2216175)
Ballooning multi-armed bandits (Q2238588)
Dynamic pricing with finite price sets: a non-parametric approach (Q2238754)
Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: a state-of-the-art (Q2242290)
BoostingTree: parallel selection of weak learners in boosting, with application to ranking (Q2251442)
Effective deadlock resolution with self-interested partially-rational agents (Q2254627)
A multi-objective Monte Carlo tree search for forest harvest scheduling (Q2286904)
Adaptive policies for perimeter surveillance problems (Q2286935)
A survey of network interdiction models and algorithms (Q2294622)