Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems
Publication:4974598
DOI: 10.1109/TAC.2009.2019797, zbMath: 1367.93738, MaRDI QID: Q4974598
Publication date: 8 August 2017
Published in: IEEE Transactions on Automatic Control
Related Items (12)
Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information ⋮ Adaptive-treed bandits ⋮ Technical note: Finite‐time regret analysis of Kiefer‐Wolfowitz stochastic approximation algorithm and nonparametric multi‐product dynamic pricing with unknown demand ⋮ Coordinating Pricing and Inventory Replenishment with Nonparametric Demand Learning ⋮ A pricing problem with unknown arrival rate and price sensitivity ⋮ Infinite Arms Bandit: Optimality via Confidence Bounds ⋮ Non-Stationary Stochastic Optimization ⋮ A revision game of experimentation on a common threshold ⋮ Dynamic Pricing with Multiple Products and Partially Specified Demand Distribution ⋮ Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm ⋮ Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models ⋮ On two continuum armed bandit problems in high dimensions