Just interpolate: kernel ``ridgeless'' regression can generalize

Publication: 2196223

DOI: 10.1214/19-AOS1849
zbMath: 1453.68155
arXiv: 1808.00387
OpenAlex: W3104969455
MaRDI QID: Q2196223

Tengyuan Liang, Alexander Rakhlin

Publication date: 28 August 2020

Published in: The Annals of Statistics

Full work available at URL: https://arxiv.org/abs/1808.00387




Related Items (39)

Canonical thresholding for nonsparse high-dimensional linear regression
Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
Deep Neural Networks, Generic Universal Interpolation, and Controlled ODEs
Deep learning: a statistical viewpoint
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
Communication-efficient distributed estimator for generalized linear models with a diverging number of covariates
Surprises in high-dimensional ridgeless least squares interpolation
Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration
On the proliferation of support vectors in high dimensions
Learning from non-random data in Hilbert spaces: an optimal recovery perspective
A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers
Theoretical issues in deep networks
Benign overfitting in linear regression
Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
Improved complexities for stochastic conditional gradient methods under interpolation-like conditions
Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits
HARFE: hard-ridge random feature expansion
On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
SVRG meets AdaGrad: painless variance reduction
Unnamed Item
Benign Overfitting and Noisy Features
Tractability from overparametrization: the example of the negative perceptron
Unnamed Item
Unnamed Item
Linearized two-layers neural networks in high dimension
Diversity Sampling is an Implicit Regularization for Kernel Methods
Generalization Error of Minimum Weighted Norm and Kernel Interpolation
Unnamed Item
On the robustness of minimum norm interpolators and regularized empirical risk minimizers
A Multi-resolution Theory for Approximating Infinite-p-Zero-n: Transitional Inference, Individualized Predictions, and a World Without Bias-Variance Tradeoff
Unnamed Item
Unnamed Item
Multilevel Fine-Tuning: Closing Generalization Gaps in Approximation of Solution Maps under a Limited Budget for Training Data
A Unifying Tutorial on Approximate Message Passing
The interpolation phase transition in neural networks: memorization and generalization under lazy training
A sieve stochastic gradient descent estimator for online nonparametric regression in Sobolev ellipsoids
A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent
For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability


Uses Software


Cites Work


This page was built for publication: Just interpolate: kernel ``ridgeless'' regression can generalize