Minimum discrepancy principle strategy for choosing $k$ in $k$-NN regression

Publication:6347398

arXiv: 2008.08718, MaRDI QID: Q6347398

Author name not available

Publication date: 19 August 2020

Abstract: We present a novel data-driven strategy for choosing the hyperparameter $k$ in the $k$-NN regression estimator. We treat the choice of the hyperparameter as an iterative procedure (over $k$) and propose a strategy, easy to implement in practice, that is based on early stopping and the minimum discrepancy principle. This model selection strategy is proven to be minimax-optimal, under the fixed-design assumption on the covariates, over some smoothness function classes, for instance the class of Lipschitz functions on a bounded domain. The novel method often improves statistical performance on artificial and real-world data sets in comparison to other model selection strategies, such as the hold-out method and 5-fold cross-validation. The novelty of the strategy comes from reducing the computational time of the model selection procedure while preserving the statistical (minimax) optimality of the resulting estimator. More precisely, given a sample of size $n$, and assuming that the nearest neighbors are already precomputed, if one chooses $k$ among $\{1, \ldots, n\}$, the strategy reduces the computational time of the generalized cross-validation or Akaike's AIC criteria from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2 (n - k))$, where $k$ is the proposed (minimum discrepancy principle) number of nearest neighbors. Code for the simulations is provided at https://github.com/YaroslavAveryanov/Minimum-discrepancy-principle-for-choosing-k.
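
The early-stopping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the companion repository for that); it is one plausible reading of the minimum discrepancy principle under stated assumptions: the noise variance `sigma2` is taken as known, the iteration runs from $k = n$ down to $k = 1$, and the procedure stops at the first $k$ whose empirical risk falls to or below `sigma2`. The function name `mdp_choose_k` and its parameters are hypothetical.

```python
import numpy as np


def mdp_choose_k(X, y, sigma2):
    """Illustrative discrepancy-principle choice of k for k-NN regression.

    Assumptions (not taken from the source): sigma2 is a known noise
    variance, and we stop at the first k, scanning from n down to 1,
    whose empirical risk is at most sigma2.
    """
    n = len(y)
    # Precompute neighbor orderings once; each point is its own nearest neighbor.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(dists, axis=1, kind="stable")

    for k in range(n, 0, -1):
        # k-NN fitted values at the design points: mean response over k neighbors.
        fitted = y[order[:, :k]].mean(axis=1)
        # Empirical risk (mean squared residual) for this k.
        risk = np.mean((y - fitted) ** 2)
        if risk <= sigma2:
            return k  # early stop: residual is down to the noise level
    return 1  # fallback; with distinct points, k = 1 already has zero risk


# Small synthetic example (all values here are illustrative).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(60, 1))
y_demo = np.sin(2 * np.pi * X_demo[:, 0]) + 0.3 * rng.standard_normal(60)
k_demo = mdp_choose_k(X_demo, y_demo, sigma2=0.3 ** 2)
```

Because the loop stops as soon as the residual reaches the noise level, it evaluates only $n - k + 1$ candidate values rather than all $n$, which is the source of the computational saving described in the abstract. In practice the noise variance would have to be estimated rather than supplied.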




Has companion code repository: https://github.com/YaroslavAveryanov/Minimum-discrepancy-principle-for-choosing-k