Surprises in high-dimensional ridgeless least squares interpolation
DOI: 10.1214/21-AOS2133 · zbMath: 1486.62202 · arXiv: 1903.08560 · OpenAlex: W2923764619 · MaRDI QID: Q2131262
Publication date: 25 April 2022
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/1903.08560
Mathematics Subject Classification
- Asymptotic properties of parametric estimators (62F12)
- Ridge regression; shrinkage estimators (Lasso) (62J07)
- Linear regression; mixed models (62J05)
- Random matrices (probabilistic aspects) (60B20)
- General nonlinear regression (62J02)
- Learning and adaptive systems in artificial intelligence (68T05)
Related Items
- Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
- Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
- Deep learning: a statistical viewpoint
- Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
- Learning curves of generic features maps for realistic datasets with a teacher-student model*
- Generalization error rates in kernel regression: the crossover from the noiseless to noisy regime*
- On the proliferation of support vectors in high dimensions*
- A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers
- Benign overfitting in linear regression
- Trading Signals in VIX Futures
- Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
- Benefit of Interpolation in Nearest Neighbor Algorithms
- Ridge-type linear shrinkage estimation of the mean matrix of a high-dimensional normal distribution
- HARFE: hard-ridge random feature expansion
- High dimensional binary classification under label shift: phase transition and regularization
- Large-dimensional random matrix theory and its applications in deep learning and wireless communications
- On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
- Cross-Trait Prediction Accuracy of Summary Statistics in Genome-Wide Association Studies
- Free dynamics of feature learning processes
- A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
- Mini-workshop: Mathematical foundations of robust and generalizable learning. Abstracts from the mini-workshop held October 2--8, 2022
- Bayesian Conjugacy in Probit, Tobit, Multinomial Probit and Extensions: A Review and New Results
- Smoothly varying regularization
- Random neural networks in the infinite width limit as Gaussian processes
- Stability of the scattering transform for deformations with minimal regularity
- Universality of approximate message passing with semirandom matrices
- High-Dimensional Analysis of Double Descent for Linear Regression with Random Projections
- A Generalization Gap Estimation for Overparameterized Models via the Langevin Functional Variance
- Universality of regularized regression estimators in high dimensions
- Training-conditional coverage for distribution-free predictive inference
- Benign Overfitting and Noisy Features
- Dimension independent excess risk by stochastic gradient descent
- Precise statistical analysis of classification accuracies for adversarial training
- On the robustness of minimum norm interpolators and regularized empirical risk minimizers
- AdaBoost and robust one-bit compressed sensing
- A Unifying Tutorial on Approximate Message Passing
- A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks
- The interpolation phase transition in neural networks: memorization and generalization under lazy training
- Prediction, Estimation, and Attribution
- For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability
- Prediction errors for penalized regressions based on generalized approximate message passing
Cites Work
- Spectral convergence for a general class of random matrices
- Eigenvectors of some large sample covariance matrix ensembles
- Anisotropic local laws for random matrices
- The spectrum of kernel random matrices
- A survey of cross-validation procedures for model selection
- Asymptotic optimality of \(C_L\) and generalized cross-validation in ridge regression with application to spline smoothing
- Asymptotic optimality for \(C_p\), \(C_L\), cross-validation and generalized cross-validation: Discrete index set
- Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation
- The spectral norm of random inner-product kernel matrices
- High-dimensional asymptotics of prediction: ridge regression and classification
- Linearized two-layers neural networks in high dimension
- The distribution of the Lasso: uniform control over sparse balls and adaptive parameter tuning
- High-dimensional dynamics of generalization error in neural networks
- Gradient descent optimizes over-parameterized deep ReLU networks
- Just interpolate: kernel ``ridgeless'' regression can generalize
- The spectrum of random inner-product kernel matrices
- Time-frequency localization operators: a geometric phase space approach
- Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter
- Atomic Decomposition by Basis Pursuit
- Consistent Risk Estimation in Moderately High-Dimensional Linear Regression
- A mean field view of the landscape of two-layer neural networks
- Generalisation error in learning with random features and the hidden manifold model*
- Two Models of Double Descent for Weak Features
- Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting, and Regularization
- Benign overfitting in linear regression
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Mean Field Analysis of Neural Networks: A Law of Large Numbers
- Nonlinear random matrix theory for deep learning
- Scaling description of generalization with number of parameters in deep learning
- Wide neural networks of any depth evolve as linear models under gradient descent*
- A jamming transition from under- to over-parametrization affects generalization in deep learning
- Deep learning: a statistical viewpoint
- Random Matrix Theory and Wireless Communications
- Ridge regression and asymptotic minimax estimation over spheres of growing dimension