Deprecated: $wgMWOAuthSharedUserIDs=false is deprecated, set $wgMWOAuthSharedUserIDs=true, $wgMWOAuthSharedUserSource='local' instead [Called from MediaWiki\HookContainer\HookContainer::run in /var/www/html/w/includes/HookContainer/HookContainer.php at line 135] in /var/www/html/w/includes/Debug/MWDebug.php on line 372
Improved regret for zeroth-order adversarial bandit convex optimisation - MaRDI portal

Improved regret for zeroth-order adversarial bandit convex optimisation (Q2035748)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Improved regret for zeroth-order adversarial bandit convex optimisation
scientific article

    Statements

    Improved regret for zeroth-order adversarial bandit convex optimisation (English)
    0 references
    0 references
    25 June 2021
    0 references
    Summary: We prove that the information-theoretic upper bound on the minimax regret for zeroth-order adversarial bandit convex optimisation is at most \(O(d^{2.5} \sqrt{n} \log(n))\), where \(d\) is the dimension and \(n\) is the number of interactions. This improves on the bound of \(O(d^{9.5} \sqrt{n} \log(n)^{7.5})\) by \textit{S. Bubeck} et al. [in: Proceedings of the 49th annual ACM SIGACT symposium on theory of computing, STOC 2017. New York, NY: Association for Computing Machinery (ACM). 72--85 (2017; Zbl 1370.90175)]. The proof is based on identifying an improved exploratory distribution for convex functions.
    0 references
    bandit convex optimisation
    0 references
    online learning
    0 references

    Identifiers