Learning-Rate-Free Learning by D-Adaptation

Publication: 6423769

arXiv: 2301.07733
MaRDI QID: Q6423769

Author name not available

Publication date: 18 January 2023

Abstract: D-Adaptation is an approach to automatically setting the learning rate which asymptotically achieves the optimal rate of convergence for minimizing convex Lipschitz functions, with no back-tracking or line searches, and no additional function value or gradient evaluations per step. Our approach is the first hyper-parameter-free method for this class without additional multiplicative log factors in the convergence rate. We present extensive experiments for SGD and Adam variants of our method, where the method automatically matches hand-tuned learning rates across more than a dozen diverse machine learning problems, including large-scale vision and language problems. An open-source implementation is available.

Has companion code repository: https://github.com/facebookresearch/dadaptation
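
The companion repository provides the SGD and Adam variants described in the abstract as drop-in PyTorch optimizers. Below is a minimal usage sketch, assuming the code installs as the dadaptation package and exposes a DAdaptAdam optimizer class whose lr argument acts as a multiplier on the automatically adapted step size (as the repository's README suggests); the model, data, and loss are placeholder stand-ins, not taken from the paper.

import torch
import dadaptation

# Placeholder model and loss; the D-Adaptation optimizer is used like
# any other PyTorch optimizer.
model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()

# lr stays at 1.0: D-Adaptation estimates the learning rate itself,
# and lr only rescales that estimate (assumed API of the repository).
optimizer = dadaptation.DAdaptAdam(model.parameters(), lr=1.0)

for step in range(100):
    x = torch.randn(32, 10)  # synthetic batch, for illustration only
    y = torch.randn(32, 1)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()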
