Minimum sample size to identify nonzero coefficients in normal regression (Q1905124)

Ordinary least squares (OLS) estimation of the vector of coefficients in \(n\)-variable linear normal regression requires an experimental design with \(O(n)\) points. The number of design points can be reduced substantially, however, if it is known that only \(k\) of the \(n\) coefficients are nonzero. The method proposed in the author's paper, Cybern. Syst. Anal. 29, No. 5, 716-726 (1993); translation from Kibern. Sist. Anal. 1993, No. 5, 104-115 (1993; Zbl 0814.62036), for instance, requires only \(O(\ln n)\) points as \(n \to \infty\) with \(k = \text{const}\). The method consists of two stages. In the first stage, a list of the indices of the nonzero regression coefficients is created, which requires \(O(\ln n)\) points. In the second stage, the columns with these indices are extracted from the full observation matrix to form a reduced observation matrix, and the nonzero regression coefficients are estimated by applying OLS to the reduced matrix. For OLS estimation in the second stage to work, it is sufficient that the reduced observation matrix contains information about \(O(k)\) experimental points. For \(n \to \infty\) and \(k = \text{const}\) we have \(O(\ln n) \gg O(k)\), so the size of the experimental design required to solve the entire problem is determined by the first stage, which is the main consumer of experimental information. Are there algorithms for which \(o(\ln n)\) experimental design points are sufficient to identify the nonzero coefficients of normal regression in the first stage? We prove that no such algorithms exist among passive probabilistic algorithms. The proof is presented for the case of a regression function in the form of a generalized polynomial.
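
The two-stage structure described above can be illustrated with a toy sketch. This is not the method of the cited paper: it assumes the simplest case \(k = 1\), a fixed (passive) \(\pm 1\) design chosen here purely for illustration, and noise that is small relative to the single nonzero coefficient. All function names (`build_design`, `observe`, `find_support`, `ols_single_column`) are hypothetical. The design uses \(1 + \lceil \log_2 n \rceil\) points: one constant row, plus one row per bit of the column index.

```python
import random


def build_design(n):
    """Fixed (passive) +/-1 design with 1 + ceil(log2 n) points.

    Illustrative assumption, not the design of the cited paper:
    row 0 is constant, row r + 1 encodes bit r of the column index.
    """
    bits = (n - 1).bit_length()
    rows = [[1.0] * n]  # constant row: reveals the sign of the coefficient
    for r in range(bits):
        rows.append([1.0 if (j >> r) & 1 else -1.0 for j in range(n)])
    return rows


def observe(design, index, beta, sigma=0.0, rng=None):
    """Simulate y = beta * x[index] + Gaussian noise at every design point."""
    rng = rng or random.Random(0)
    return [beta * row[index] + rng.gauss(0.0, sigma) for row in design]


def find_support(design, y):
    """Stage 1: decode the index of the nonzero coefficient from signs.

    Assumes k = 1 and noise small enough that no observation changes sign.
    """
    ref = y[0]
    index = 0
    for r, value in enumerate(y[1:]):
        if value * ref > 0:  # same sign as the constant row: bit r is set
            index |= 1 << r
    return index


def ols_single_column(design, y, index):
    """Stage 2: OLS on the reduced observation matrix (a single column)."""
    col = [row[index] for row in design]
    return sum(c * v for c, v in zip(col, y)) / sum(c * c for c in col)


if __name__ == "__main__":
    n = 1000
    design = build_design(n)
    y = observe(design, index=737, beta=2.5, sigma=0.1)
    j = find_support(design, y)
    print(len(design), j, ols_single_column(design, y, j))
```

With \(n = 1000\) the sketch uses 11 design points instead of the order of 1000 that full OLS would need. Each observation carries roughly one bit about the index, which gives some intuition for why the number of points cannot drop below the order of \(\ln n\); the formal lower bound for passive probabilistic algorithms is the subject of the reviewed paper.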

zbMATH Keywords

ordinary least squares estimation

linear normal regression

reduced observation matrix

MaRDI profile type

Publication

cites work