Training behavior of deep neural network in frequency domain
From MaRDI portal
Publication: 6303821
arXiv: 1807.01251 · MaRDI QID: Q6303821
Author name not available
Publication date: 3 July 2018
Abstract: Why deep neural networks (DNNs), despite being capable of overfitting, often generalize well in practice remains a mystery [#zhang2016understanding]. To identify a potential mechanism, we study the implicit biases underlying the training process of DNNs. In this work, for both real and synthetic datasets, we empirically find that a DNN with common settings first quickly captures the dominant low-frequency components of the target and then relatively slowly captures the high-frequency ones. We call this phenomenon the Frequency Principle (F-Principle). In our experiments, the F-Principle can be observed across DNNs with various architectures, activation functions, and training algorithms. We also illustrate how the F-Principle helps explain the effect of early stopping as well as the generalization of DNNs. The F-Principle potentially provides insight into a general principle underlying DNN optimization and generalization.
Has companion code repository: https://github.com/xuzhiqin1990/F-Principle
This page was built for publication: Training behavior of deep neural network in frequency domain