Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees

Publication: 6401266

arXiv: 2206.02659 · MaRDI QID: Q6401266

Author name not available

Publication date: 6 June 2022

Abstract: We consider transfer learning approaches that fine-tune a pretrained deep neural network on a target task. We study the generalization properties of fine-tuning to understand the problem of overfitting, which commonly occurs in practice. Previous works have shown that constraining the fine-tuned model's distance from its initialization improves generalization. Using a PAC-Bayesian analysis, we observe that, besides the distance from initialization, Hessians affect generalization through the noise stability of deep neural networks under noise injections. Motivated by this observation, we develop Hessian-distance-based generalization bounds for a wide range of fine-tuning methods. Additionally, we study the robustness of fine-tuning in the presence of noisy labels. We design an algorithm that incorporates consistent losses and distance-based regularization for fine-tuning, along with a generalization error guarantee under class-conditional independent noise in the training labels. We perform a detailed empirical study of our algorithm across various noisy environments and architectures. On six image classification tasks whose training labels are generated with programmatic labeling, we find a 3.26% accuracy gain over prior fine-tuning methods. Meanwhile, the Hessian distance measure of the fine-tuned model decreases six-fold relative to existing approaches.
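
For intuition, the training objective sketched in the abstract (a consistent loss under class-conditional label noise, plus distance-based regularization toward the pretrained initialization) can be written in a few lines. The sketch below assumes PyTorch and uses forward loss correction with a known noise transition matrix T as one common consistent-loss construction; the function names, hyperparameters, and the choice of correction are illustrative assumptions, not the paper's exact algorithm (see the companion repository below for the authors' implementation).

    import torch
    import torch.nn.functional as F

    def make_init_snapshot(model):
        # Frozen copy of the pretrained weights, used as the regularization anchor.
        return [p.detach().clone() for p in model.parameters()]

    def finetune_step(model, init_params, x, y_noisy, T, lam, optimizer):
        # T: assumed known class-conditional noise transition matrix,
        # T[i, j] = P(observed label j | true label i), shape (C, C).
        logits = model(x)
        probs = F.softmax(logits, dim=1)
        # Forward-corrected ("consistent") cross-entropy: push clean-class
        # probabilities through T before scoring against the noisy labels.
        noisy_probs = probs @ T
        loss = F.nll_loss(torch.log(noisy_probs + 1e-12), y_noisy)
        # Distance-from-initialization regularizer (squared L2 to the anchor).
        reg = sum(((p - p0) ** 2).sum()
                  for p, p0 in zip(model.parameters(), init_params))
        total = loss + lam * reg
        optimizer.zero_grad()
        total.backward()
        optimizer.step()
        return total.item()

In a training loop, init_params would be captured once from the pretrained model before fine-tuning begins, and T would have to be known or estimated (e.g., from a small clean subset); the squared-L2 anchor term is what "distance-based regularization" refers to above.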

Has companion code repository: https://github.com/neu-statsml-research/robust-fine-tuning