Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
From MaRDI portal
Publication: 6401879
arXiv: 2206.05976 · MaRDI QID: Q6401879
Author name not available
Publication date: 13 June 2022
Abstract: Gradient-based optimization methods for hyperparameter tuning guarantee theoretical convergence to stationary solutions when, for fixed upper-level variables, the lower-level problem of the bilevel program is strongly convex (LLSC) and smooth (LLS). This condition fails for the bilevel programs arising from hyperparameter tuning in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We show that this algorithm achieves stationary solutions without the LLSC and LLS assumptions for bilevel programs from a broad class of hyperparameter tuning applications. Our extensive experiments confirm the theoretical findings and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters.
Has companion code repository: https://github.com/sustech-optimization/vf-idca
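To make the problem class concrete, here is a minimal sketch of the kind of bilevel hyperparameter selection problem the abstract describes: the lower level fits a lasso model (its l1 term is nonsmooth, so the LLS assumption fails), and the upper level selects the regularization parameter by validation loss. This is an illustration of the problem structure only; the lower level is solved here by plain ISTA and the upper level by grid search, not by the paper's VF-iDCA method, and all function names and data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    # proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    # Lower-level problem: min_w 0.5*||X w - y||^2 + lam*||w||_1.
    # The l1 term is nonsmooth, so this lower level violates the
    # LLS assumption mentioned in the abstract.
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - grad / L, lam / L)
    return w

def bilevel_grid_search(X_tr, y_tr, X_val, y_val, lams):
    # Upper level: choose lam minimizing validation loss at the
    # lower-level minimizer (grid search stand-in, not VF-iDCA).
    best_lam, best_loss = None, np.inf
    for lam in lams:
        w = lasso_ista(X_tr, y_tr, lam)
        loss = 0.5 * np.sum((X_val @ w - y_val) ** 2)
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam, best_loss
```

A small synthetic run would pass disjoint training and validation splits to `bilevel_grid_search` together with a candidate grid such as `[0.01, 0.1, 1.0, 10.0]`; the paper's contribution is replacing this naive upper-level search with a value-function-based DC algorithm that provably reaches stationary points of the bilevel program.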
This page was built for publication: Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems