Adaptive Gradient Descent without Descent

Publication: 6327620

arXiv: 1910.09529 · MaRDI QID: Q6327620

Author name not available

Publication date: 21 October 2019

Abstract: We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) don't increase the stepsize too fast and 2) don't overstep the local curvature. No function values, no line search, and no information about the function are needed beyond its gradients. By following these rules, you get a method that adapts to the local geometry, with convergence guarantees depending only on the smoothness in a neighborhood of a solution. Provided the problem is convex, the method converges even if the global smoothness constant is infinite; as an illustration, it can minimize an arbitrary twice continuously differentiable convex function. We examine its performance on a range of convex and nonconvex problems, including logistic regression and matrix factorization.

Has companion code repository: https://github.com/ymalitsky/adaptive_gd
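The abstract specifies the method only through its two stepsize rules, so the following Python sketch illustrates how such an update might look, loosely in the spirit of the companion repository. The function name adaptive_gd, the initial stepsize lam0, and the exact form of the growth and curvature caps are illustrative assumptions, not the paper's verbatim algorithm.

```python
import numpy as np

def adaptive_gd(grad, x0, n_iter=1000, lam0=1e-6):
    """Gradient descent with an adaptive stepsize built from the two rules
    in the abstract: (1) do not increase the stepsize too fast and
    (2) do not overstep the locally estimated curvature.
    Uses only gradients: no function values and no line search."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    lam_prev = lam0                 # small initial stepsize (illustrative choice)
    x = x_prev - lam_prev * g_prev  # one plain step to obtain a second point
    theta = 0.0                     # ratio of consecutive stepsizes
    for _ in range(n_iter):
        g = grad(x)
        diff_x = np.linalg.norm(x - x_prev)
        diff_g = np.linalg.norm(g - g_prev)
        # Rule 1: limit how fast the stepsize may grow.
        growth_cap = np.sqrt(1.0 + theta) * lam_prev
        # Rule 2: stay below the inverse of the local curvature,
        # estimated from consecutive iterates and gradients.
        curvature_cap = diff_x / (2.0 * diff_g) if diff_g > 0 else growth_cap
        lam = min(growth_cap, curvature_cap)
        x_next = x - lam * g
        theta = lam / lam_prev
        x_prev, g_prev = x, g
        x, lam_prev = x_next, lam
    return x

# Example: minimize an ill-conditioned quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0, 100.0])
x_min = adaptive_gd(lambda x: A @ x, x0=np.ones(3), n_iter=500)
```

Note that the stepsize is recomputed at every iteration from the last two iterates and gradients only, which is what makes the method adaptive to local geometry without requiring a global smoothness constant.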

