scientific article; zbMATH DE number 6253908
Publication:5396640
zbMath: 1280.91039 ⋮ MaRDI QID: Q5396640
Publication date: 3 February 2014
Full work available at URL: http://www.jmlr.org/papers/v12/hazan11a.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Decision theory (91B06) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Markov and semi-Markov decision processes (90C40) ⋮ Probabilistic games; gambling (91A60)
Related Items (8)
Unnamed Item ⋮ Relaxing the i.i.d. assumption: adaptively minimax optimal regret via root-entropic regularization ⋮ Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards ⋮ Extracting certainty from uncertainty: regret bounded by variation in costs ⋮ Truthful Mechanisms with Implicit Payment Computation ⋮ AN ONLINE PORTFOLIO SELECTION ALGORITHM WITH REGRET LOGARITHMIC IN PRICE VARIATION ⋮ Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm ⋮ Doubly robust policy evaluation and optimization