Sketched Ridgeless Linear Regression: The Role of Downsampling
Publication:6425220
arXiv: 2302.01088, MaRDI QID: Q6425220
Author name not available
Publication date: 2 February 2023
Abstract: Overparametrization often helps improve generalization performance. This paper proposes a dual view of overparametrization, suggesting that downsampling may also help generalization. Motivated by this dual view, we characterize two out-of-sample prediction risks of the sketched ridgeless least squares estimator in the proportional regime $m \asymp n \asymp p$, where $m$ is the sketching size, $n$ the sample size, and $p$ the feature dimensionality. Our results reveal the statistical role of downsampling. Specifically, downsampling does not always hurt generalization performance, and may actually improve it in some cases. We identify the optimal sketching sizes that minimize the out-of-sample prediction risks, and find that the optimally sketched estimator has more stable risk curves that eliminate the peaks of those for the full-sample estimator. We then propose a practical procedure to empirically identify the optimal sketching size. Finally, we extend our results to cover central limit theorems and misspecified models. Numerical studies strongly support our theory.
Has companion code repository: https://github.com/statsle/srlr_python
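To make the estimator in the abstract concrete, the following is a minimal illustrative sketch and is not taken from the companion repository. It assumes the sketch is uniform row subsampling without replacement (one simple form of downsampling), and that "ridgeless" means the minimum-norm least squares solution computed with a pseudoinverse. The function name `sketched_ridgeless` and the simulation settings (isotropic Gaussian features, noise level 0.5) are hypothetical choices made only for this example.

```python
import numpy as np


def sketched_ridgeless(X, y, m, rng):
    """Minimum-norm least squares fit on m rows drawn without replacement.

    Assumption: the sketch is plain uniform subsampling of the rows.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)  # indices of the retained rows
    # pinv gives the minimum-norm solution when m < p and the
    # ordinary least squares solution when m >= p with full column rank.
    return np.linalg.pinv(X[idx]) @ y[idx]


# Small simulation with hypothetical settings: n samples, p features.
rng = np.random.default_rng(0)
n, p = 200, 100
beta = rng.standard_normal(p) / np.sqrt(p)
X = rng.standard_normal((n, p))            # isotropic features
y = X @ beta + 0.5 * rng.standard_normal(n)

# With identity feature covariance, the excess prediction risk of an
# estimate b equals the squared error ||b - beta||^2.
for m in (50, 90, 100, 110, 150, 200):
    beta_hat = sketched_ridgeless(X, y, m, rng)
    err = float(np.sum((beta_hat - beta) ** 2))
    print(f"m = {m:3d}  squared error = {err:.3f}")
```

Running the loop for several values of `m` gives a rough empirical risk curve over sketching sizes; a single random draw is noisy, so averaging over repetitions would be needed before comparing it with any theoretical curve.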