Surprises in high-dimensional ridgeless least squares interpolation
DOI: 10.1214/21-AOS2133 · zbMath: 1486.62202 · arXiv: 1903.08560 · OpenAlex: W2923764619 · MaRDI QID: Q2131262
Publication date: 25 April 2022
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/1903.08560
Mathematics Subject Classification
- Asymptotic properties of parametric estimators (62F12)
- Ridge regression; shrinkage estimators (Lasso) (62J07)
- Linear regression; mixed models (62J05)
- Random matrices (probabilistic aspects) (60B20)
- General nonlinear regression (62J02)
- Learning and adaptive systems in artificial intelligence (68T05)
Related Items
- Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
- Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
- Deep learning: a statistical viewpoint
- Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
- Learning curves of generic features maps for realistic datasets with a teacher-student model*
- Generalization error rates in kernel regression: the crossover from the noiseless to noisy regime*
- On the proliferation of support vectors in high dimensions*
- A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers
- Benign overfitting in linear regression
- Trading Signals in VIX Futures
- Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
- Benefit of Interpolation in Nearest Neighbor Algorithms
- Ridge-type linear shrinkage estimation of the mean matrix of a high-dimensional normal distribution
- HARFE: hard-ridge random feature expansion
- High dimensional binary classification under label shift: phase transition and regularization
- Large-dimensional random matrix theory and its applications in deep learning and wireless communications
- On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
- Cross-Trait Prediction Accuracy of Summary Statistics in Genome-Wide Association Studies
- Free dynamics of feature learning processes
- A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
- Mini-workshop: Mathematical foundations of robust and generalizable learning. Abstracts from the mini-workshop held October 2--8, 2022
- Bayesian Conjugacy in Probit, Tobit, Multinomial Probit and Extensions: A Review and New Results
- Smoothly varying regularization
- Random neural networks in the infinite width limit as Gaussian processes
- Stability of the scattering transform for deformations with minimal regularity
- Universality of approximate message passing with semirandom matrices
- High-Dimensional Analysis of Double Descent for Linear Regression with Random Projections
- A Generalization Gap Estimation for Overparameterized Models via the Langevin Functional Variance
- Universality of regularized regression estimators in high dimensions
- Training-conditional coverage for distribution-free predictive inference
- Benign Overfitting and Noisy Features
- Dimension independent excess risk by stochastic gradient descent
- Precise statistical analysis of classification accuracies for adversarial training
- On the robustness of minimum norm interpolators and regularized empirical risk minimizers
- AdaBoost and robust one-bit compressed sensing
- A Unifying Tutorial on Approximate Message Passing
- A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks
- The interpolation phase transition in neural networks: memorization and generalization under lazy training
- Prediction, Estimation, and Attribution
- For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability
- Prediction errors for penalized regressions based on generalized approximate message passing
Cites Work
- Spectral convergence for a general class of random matrices
- Eigenvectors of some large sample covariance matrix ensembles
- Anisotropic local laws for random matrices
- The spectrum of kernel random matrices
- A survey of cross-validation procedures for model selection
- Asymptotic optimality of \(C_L\) and generalized cross-validation in ridge regression with application to spline smoothing
- Asymptotic optimality for \(C_p\), \(C_L\), cross-validation and generalized cross-validation: Discrete index set
- Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation
- The spectral norm of random inner-product kernel matrices
- High-dimensional asymptotics of prediction: ridge regression and classification
- Linearized two-layers neural networks in high dimension
- The distribution of the Lasso: uniform control over sparse balls and adaptive parameter tuning
- High-dimensional dynamics of generalization error in neural networks
- Gradient descent optimizes over-parameterized deep ReLU networks
- Just interpolate: kernel ``ridgeless'' regression can generalize
- The spectrum of random inner-product kernel matrices
- Time-frequency localization operators: a geometric phase space approach
- Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter
- Atomic Decomposition by Basis Pursuit
- Consistent Risk Estimation in Moderately High-Dimensional Linear Regression
- A mean field view of the landscape of two-layer neural networks
- Generalisation error in learning with random features and the hidden manifold model*
- Two Models of Double Descent for Weak Features
- Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting, and Regularization
- Benign overfitting in linear regression
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Mean Field Analysis of Neural Networks: A Law of Large Numbers
- Nonlinear random matrix theory for deep learning
- Scaling description of generalization with number of parameters in deep learning
- Wide neural networks of any depth evolve as linear models under gradient descent*
- A jamming transition from under- to over-parametrization affects generalization in deep learning
- Deep learning: a statistical viewpoint
- Random Matrix Theory and Wireless Communications
- Ridge regression and asymptotic minimax estimation over spheres of growing dimension