On the two-armed bandit problem with continuous time parameter and discounted rewards
Publication:3786305
DOI: 10.1080/17442508808833495 · zbMath: 0643.90096 · OpenAlex: W2120701727 · MaRDI QID: Q3786305
Publication date: 1988
Published in: Stochastics
Full work available at URL: https://doi.org/10.1080/17442508808833495
continuous-time two-armed bandit; expected discounted reward; stationary optimal policy; explicit formulae
Related Items (4)
Average optimality in a Poissonian bandit with switching arms ⋮ Learning to disagree in a game of experimentation ⋮ Good signals gone bad: dynamic signalling with switched effort levels ⋮ On the two-armed bandit problem with non-observed Poissonian switching of arms.