Random Function Descent
Publication: 6435032
arXiv: 2305.01377
MaRDI QID: Q6435032
Author name not available
Publication date: 2 May 2023
Abstract: While gradient-based methods are ubiquitous in machine learning, selecting the right step size often requires "hyperparameter tuning". This is because backtracking procedures like Armijo's rule depend on quality evaluations at every step, which are not available in a stochastic context. Since optimization schemes can be motivated using Taylor approximations, we replace the Taylor approximation with the conditional expectation (the best estimator) and propose "Random Function Descent" (RFD). Under light assumptions common in Bayesian optimization, we prove that RFD is identical to gradient descent, but with calculable step sizes, even in a stochastic context. We beat untuned Adam in synthetic benchmarks. To close the performance gap, we propose a heuristic extension that is competitive with tuned Adam.
Has companion code repository: https://github.com/FelixBenning/pyrfd
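For orientation, the sketch below illustrates the update pattern the abstract describes: plain gradient descent whose step size is computed from a model of the objective rather than tuned by hand. It is a minimal, hedged illustration only; the names `rfd_style_descent` and `step_size_fn` are placeholders introduced here and do not reproduce the paper's actual step-size derivation or the API of the linked pyrfd repository.

```python
import numpy as np

def rfd_style_descent(grad, x0, step_size_fn, n_steps=100):
    """Gradient descent with a computed (not hand-tuned) step size.

    `step_size_fn` stands in for a covariance-model-derived step-size
    rule; here it is a user-supplied placeholder, not the RFD formula.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm == 0.0:
            break  # (sampled) gradient vanished; stop
        eta = step_size_fn(gnorm)  # step size from a model, not a tuned constant
        x = x - eta * g
    return x

# Toy usage on a quadratic objective, with a dummy step-size rule:
if __name__ == "__main__":
    grad = lambda x: 2.0 * x  # gradient of f(x) = ||x||^2
    x_final = rfd_style_descent(
        grad, x0=np.ones(3), step_size_fn=lambda gnorm: 0.1
    )
    print(x_final)
```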