Scaling description of generalization with number of parameters in deep learning
From MaRDI portal
Publication:5856249
DOI: 10.1088/1742-5468/ab633c
zbMath: 1459.82250
arXiv: 1901.01608
OpenAlex: W2907047316
MaRDI QID: Q5856249
Matthieu Wyart, Levent Sagun, Franck Gabriel, Clément Hongler, A. Jacot, Stefano Spigler, Stéphane D'Ascoli, Mario Geiger, Giulio Biroli
Publication date: 25 March 2021
Published in: Journal of Statistical Mechanics: Theory and Experiment
Full work available at URL: https://arxiv.org/abs/1901.01608
Learning and adaptive systems in artificial intelligence (68T05)
Neural nets applied to problems in time-dependent statistical mechanics (82C32)
Related Items (17)
Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
Surprises in high-dimensional ridgeless least squares interpolation
The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima
Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
Large-dimensional random matrix theory and its applications in deep learning and wireless communications
Free dynamics of feature learning processes
High-Dimensional Analysis of Double Descent for Linear Regression with Random Projections
Harmonic analysis of network systems via kernels and their boundary realizations
A Generalization Gap Estimation for Overparameterized Models via the Langevin Functional Variance
Normalization effects on deep neural networks
Landscape and training regimes in deep learning
Geometric compression of invariant manifolds in neural networks
Normalization effects on shallow neural networks and related asymptotic expansions
Unnamed Item
A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks
Triple descent and the two kinds of overfitting: where and why do they appear?
An analytic theory of shallow networks dynamics for hinge loss classification
Uses Software
Cites Work
This page was built for publication: Scaling description of generalization with number of parameters in deep learning