Finite sample rates for logistic regression with small noise or few samples - MaRDI portal


Finite sample rates for logistic regression with small noise or few samples

From MaRDI portal
Publication:6510281

arXiv: 2305.15991 | MaRDI QID: Q6510281

Author name not available


Abstract: The logistic regression estimator is known to inflate the magnitude of its coefficients if the sample size $n$ is small, the dimension $p$ is (moderately) large, or the signal-to-noise ratio $1/\sigma$ is large (probabilities of observing a label are close to 0 or 1). With this in mind, we study the logistic regression estimator with $p \ll n/\log n$, assuming Gaussian covariates and labels generated by the Gaussian link function, with a mild optimization constraint on the estimator's length to ensure existence. We provide finite sample guarantees for its direction, which serves as a classifier, and its Euclidean norm, which is an estimator for the signal-to-noise ratio. We distinguish between two regimes. In the low-noise/small-sample regime ($n\sigma \lesssim p\log n$), we show that the estimator's direction (and consequently the classification error) achieves the rate $(p\log n)/n$, as if the problem were noiseless. In this case, the norm of the estimator is at least of order $n/(p\log n)$. If instead $n\sigma \gtrsim p\log n$, the estimator's direction achieves the rate $\sqrt{\sigma p\log n/n}$, whereas its norm converges to the true norm at the rate $\sqrt{p\log n/(n\sigma^3)}$. As a corollary, the data are not linearly separable with high probability in this regime. The logistic regression estimator allows one to conclude which regime occurs with high probability. Therefore, inference for logistic regression is possible in the regime $n\sigma \gtrsim p\log n$. In either case, logistic regression provides a competitive classifier.
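The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the paper's procedure: it draws Gaussian covariates, generates labels as the sign of $\langle x, \beta^* \rangle + \sigma\varepsilon$ with standard Gaussian $\varepsilon$ (one way to realize the Gaussian link), and fits logistic regression by projected gradient descent over a ball of radius `max_norm`, which stands in for the length constraint. The names `beta_star`, `max_norm`, `steps`, `lr`, and all numeric values are illustrative assumptions.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def simulate(n, p, sigma, rng):
    """Gaussian covariates; labels y = sign(<x, beta*> + sigma * eps)."""
    beta_star = [1.0] + [0.0] * (p - 1)  # unit-norm direction (assumption)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        score = dot(x, beta_star) + sigma * rng.gauss(0.0, 1.0)
        X.append(x)
        y.append(1.0 if score > 0 else -1.0)
    return X, y, beta_star


def fit_logistic(X, y, max_norm=50.0, steps=500, lr=0.5):
    """Projected gradient descent on the mean logistic loss.

    The projection onto the ball of radius max_norm keeps the iterate
    bounded even if the data happen to be linearly separable.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(steps):
        grad = [0.0] * p
        for x, yi in zip(X, y):
            # gradient of log(1 + exp(-y <x, beta>)) with respect to beta
            w = -yi * sigmoid(-yi * dot(x, beta))
            for j in range(p):
                grad[j] += w * x[j] / n
        beta = [b - lr * g for b, g in zip(beta, grad)]
        norm = math.sqrt(dot(beta, beta))
        if norm > max_norm:
            beta = [b * max_norm / norm for b in beta]
    return beta


def summarize(n, p, sigma, seed=0):
    """Return (norm of the estimate, Euclidean error of its direction)."""
    rng = random.Random(seed)
    X, y, beta_star = simulate(n, p, sigma, rng)
    beta = fit_logistic(X, y)
    norm = math.sqrt(dot(beta, beta))
    direction = [b / norm for b in beta]
    err = math.sqrt(sum((d - s) ** 2 for d, s in zip(direction, beta_star)))
    return norm, err


if __name__ == "__main__":
    for sigma in (0.05, 1.0):
        norm, err = summarize(n=200, p=3, sigma=sigma)
        print(f"sigma={sigma}: norm={norm:.2f}, direction error={err:.3f}")
```

Under these assumptions, the fitted norm should come out noticeably larger for the small value of $\sigma$ than for the large one, in line with the coefficient inflation the abstract describes. A fixed number of gradient steps does not reach the constrained optimum on nearly separable data, so the printed norm in the low-noise case is best read as a lower bound on the norm of the constrained estimator.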











