Continuous multi-armed bandits and multiparameter processes (Q1110966)
From MaRDI portal
scientific article; zbMATH DE number 4074262
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Continuous multi-armed bandits and multiparameter processes | scientific article; zbMATH DE number 4074262 | |
Statements
Continuous multi-armed bandits and multiparameter processes (English)
1987
A general framework is proposed for continuous-time dynamic allocation models, in which a scarce resource is allocated among competing projects. The allocation model is formulated as a multi-armed bandit model and solved as a control problem for a multiparameter process. In contrast to discrete-time bandits, where only one arm can be pulled at a time, a continuous-time bandit must allow simultaneous pulls. The multiparameter approach yields a strong solution for diffusion-type bandits. In that setting the main problem is to define precisely how to switch among arms, and the solution involves local times.
dynamic allocation
Gittins' index
optional increasing path
continuous time dynamic allocation models
multi-armed bandit model
continuous time bandit
strong solution of diffusion-type bandits
local times