Minimum sample size to identify nonzero coefficients in normal regression - MaRDI portal

Minimum sample size to identify nonzero coefficients in normal regression (Q1905124)

scientific article; zbMATH DE number 830592

    Statements

    Minimum sample size to identify nonzero coefficients in normal regression (English)
    1 September 1996
    Ordinary least squares (OLS) estimation of the vector of coefficients in \(n\)-variable linear normal regression requires an experimental design with \(O(n)\) points. The number of design points can be substantially reduced, however, if it is known that only \(k\) of the \(n\) coefficients are nonzero and the remaining coefficients are zero. The method proposed in the author's paper, Cybern. Syst. Anal. 29, No. 5, 716-726 (1993); translation from Kibern. Sist. Anal. 1993, No. 5, 104-115 (1993; Zbl 0814.62036), for instance, requires only \(O(\ln n)\) points as \(n \to \infty\) with \(k = \text{const}\). The method consists of two stages. In the first stage, a list of indices of the nonzero regression coefficients is created, which requires \(O(\ln n)\) points. In the second stage, the columns with the corresponding indices are extracted from the full observation matrix, a reduced observation matrix is formed from these columns, and the nonzero regression coefficients are estimated by applying the OLS method to the reduced observation matrix. For OLS estimation in the second stage to work, it is sufficient that the reduced observation matrix contain information about \(O(k)\) experimental points. For \(n \to \infty\) and \(k = \text{const}\) we have \(O(\ln n) \gg O(k)\), so the size of the experimental design required to solve the entire problem is determined by the first stage, which is the main consumer of experimental information. Are there algorithms for which \(o(\ln n)\) experimental design points suffice to identify the nonzero coefficients of normal regression in the first stage? The author proves that no such algorithms exist among passive probabilistic algorithms. The proof is presented for the case of a regression function in the form of a generalized polynomial.
    ordinary least squares estimation
    linear normal regression
    reduced observation matrix
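The second stage described in the review (extract the columns whose indices were identified, form the reduced observation matrix, apply OLS to it) can be illustrated with a minimal sketch. This is not the author's implementation: the first-stage identification procedure is not specified in this record, so the index list `support` is simply assumed to be known, and the names `reduced_ols` and `solve_linear_system`, the Gaussian random design, and the sizes chosen below are all hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    # Work on an augmented copy so the inputs are not modified.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("reduced observation matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * m
    for i in reversed(range(m)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (aug[i][m] - s) / aug[i][i]
    return x


def reduced_ols(X, y, support):
    """OLS restricted to the columns of X listed in `support` (assumed known)."""
    # Reduced observation matrix: keep only the selected columns.
    Xr = [[row[j] for j in support] for row in X]
    k = len(support)
    # Normal equations (Xr^T Xr) beta = Xr^T y.
    gram = [[sum(r[i] * r[j] for r in Xr) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * yi for r, yi in zip(Xr, y)) for i in range(k)]
    return dict(zip(support, solve_linear_system(gram, rhs)))


# Illustrative setup (assumed values): n = 50 variables, k = 3 nonzero
# coefficients, and only N = 12 design points, far fewer than n.
rng = random.Random(0)
n, N = 50, 12
true_coeffs = {4: 2.0, 17: -1.5, 31: 0.5}
X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(N)]
y = [
    sum(c * row[j] for j, c in true_coeffs.items()) + rng.gauss(0, 0.01)
    for row in X
]

estimate = reduced_ols(X, y, sorted(true_coeffs))
print(estimate)
```

With the support fixed, the normal equations involve only a \(k \times k\) matrix, which is why a number of design points of order \(k\) is enough at this stage; the harder question addressed by the paper is how many points the first stage needs.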

    Identifiers