
Quasi-hyperbolic momentum and Adam for deep learning

From MaRDI portal
Publication:71639

DOI: 10.48550/ARXIV.1810.06801 · arXiv: 1810.06801 · MaRDI QID: Q71639

Denis Yarats, Jerry Ma

Publication date: 16 October 2018

Abstract: Momentum-based acceleration of stochastic gradient descent (SGD) is widely used in deep learning. We propose the quasi-hyperbolic momentum algorithm (QHM) as an extremely simple alteration of momentum SGD, averaging a plain SGD step with a momentum step. We describe numerous connections to and identities with other algorithms, and we characterize the set of two-state optimization algorithms that QHM can recover. Finally, we propose a QH variant of Adam called QHAdam, and we empirically demonstrate that our algorithms lead to significantly improved training in a variety of settings, including a new state-of-the-art result on WMT16 EN-DE. We hope that these empirical results, combined with the conceptual and practical simplicity of QHM and QHAdam, will spur interest from both practitioners and researchers. Code is immediately available.
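The abstract describes QHM as averaging a plain SGD step with a momentum step. A minimal sketch of that update rule, with the learning rate, momentum coefficient, and averaging weight (here `lr`, `beta`, `nu`) assumed from the paper's notation:

```python
import numpy as np

def qhm_step(theta, grad, buf, lr=0.1, beta=0.9, nu=0.7):
    """One quasi-hyperbolic momentum (QHM) step: the update direction is a
    nu-weighted average of the plain gradient and the momentum buffer.
    nu=0 recovers plain SGD; nu=1 recovers (normalized) momentum SGD."""
    buf = beta * buf + (1 - beta) * grad            # momentum buffer: EMA of gradients
    theta = theta - lr * ((1 - nu) * grad + nu * buf)
    return theta, buf

# Illustration: minimize f(theta) = 0.5 * theta**2, whose gradient is theta.
theta, buf = 5.0, 0.0
for _ in range(200):
    theta, buf = qhm_step(theta, theta, buf)
# theta is now close to the minimizer 0
```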







Related Items (1)






This page was built for publication: Quasi-hyperbolic momentum and Adam for deep learning
