Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals - MaRDI portal

Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals (Q760991)

From MaRDI portal





scientific article; zbMATH DE number 3886904
Language: English
Label: Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals
Description: scientific article; zbMATH DE number 3886904

    Statements

    Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals (English)
    1984
    When conventional least-squares estimates \(\hat y_i\) from a multiple linear regression are used for prediction, the positive correlation between the original values \(y_i\) and the residuals \(y_i-\hat y_i\) often causes considerable disadvantages. Correcting the \(\hat y_i\) therefore requires an examination of the residuals \(y_i-\hat y_i\). In this paper we propose an iterative procedure for correcting and improving the original uncorrected estimates \(\hat y_i\equiv \hat y_i^{(1)}\). Because of the correlation between \(y_i\) and \(y_i-\hat y_i\), a first regression of the \(y_i-\hat y_i\) on the \(y_i\) leads to corrected values \(\hat y^*_i\), which are again positively correlated with the \(\hat y_i\). A second regression of the \(\hat y^*_i\) on the \(\hat y_i\) gives the estimates \(\hat y_i^{(\mathrm{new})}\). This transition from \(\hat y_i\) to \(\hat y_i^{(\mathrm{new})}\) can be iterated indefinitely. After the \(n\)-th iteration step one obtains the estimates \[ \hat y_i^{(n+1)}=\bar y+R^{-1}\bigl(1-(1-R)^{n+1}\bigr)\bigl(\hat y_i^{(1)}-\bar y\bigr), \] where \(R\) denotes the coefficient of determination (the square of the multiple correlation coefficient) of the original multiple linear regression and \(\bar y\) is the mean of the \(y_i\). Both the goodness of fit between these estimates \(\hat y_i^{(n+1)}\) and the \(y_i\) and the unfavourable correlation \(r^{(n+1)}\) between \(y_i\) and the residuals \(y_i-\hat y_i^{(n+1)}\) decrease with each additional iteration. That is, \[ \sum_{i}\bigl[y_i-\hat y_i^{(n+1)}\bigr]^2=SQ^{(n+1)}\geq SQ^{(n)}=\sum_{i}\bigl[y_i-\hat y_i^{(n)}\bigr]^2 \] with \[ \frac{SQ^{(n+1)}}{SQ^{(1)}}=1+\frac{(1-R)\bigl[1-(1-R)^n\bigr]^2}{R}, \] and \[ r^{(n+1)}\leq r^{(n)}\quad\text{with}\quad \frac{r^{(n+1)}}{r^{(1)}}=\frac{(1-R)^n\sqrt{R}}{\sqrt{(1-R)\bigl[1-(1-R)^n\bigr]^2+R}}. \] Therefore "loss of fit" and "gain of correlation" are opposing properties of the proposed procedure, which may be of special interest if the "loss of fit" remains sufficiently small while the unfavourable correlation decreases markedly. The "practicable number of iterations", which depends on two parameters, has been calculated explicitly. These approaches and results are investigated and discussed in detail and are finally computed for a numerical economic example of "farm-income" dependent on "size of farm", "size of dairy" and "number of men".
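The closed-form expressions in the abstract can be checked numerically. The following is a minimal sketch, not the authors' computation: the data are synthetic (not the farm-income example), the function names are hypothetical, and it assumes an ordinary least-squares fit with an intercept, fitted here with NumPy.

```python
import numpy as np

def corrected_estimates(y, y_hat, n):
    """Closed-form estimates after n iteration steps; n = 0 returns y_hat.

    Hypothetical helper: implements the formula for y_hat^(n+1) quoted in
    the abstract, with R computed from the original least-squares fit.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    y_bar = y.mean()
    # R: coefficient of determination of the original regression
    R = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y_bar) ** 2)
    factor = (1.0 - (1.0 - R) ** (n + 1)) / R
    return y_bar + factor * (y_hat - y_bar), R

def loss_of_fit_ratio(R, n):
    """SQ^(n+1) / SQ^(1) as stated in the abstract."""
    return 1.0 + (1.0 - R) * (1.0 - (1.0 - R) ** n) ** 2 / R

def correlation_ratio(R, n):
    """r^(n+1) / r^(1) as stated in the abstract."""
    q = 1.0 - R
    return q ** n * np.sqrt(R) / np.sqrt(q * (1.0 - q ** n) ** 2 + R)

def sq(y, est):
    """Residual sum of squares between y and the estimates."""
    return float(np.sum((y - est) ** 2))

def resid_corr(y, est):
    """Correlation between y and the residuals y - est."""
    return float(np.corrcoef(y, y - est)[0, 1])

# Synthetic data (an assumption for illustration): 50 observations,
# an intercept column plus three regressors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=2.0, size=50)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta  # original estimates y_hat^(1)

for n in range(4):
    est, R = corrected_estimates(y, y_hat, n)
    print(
        f"n={n}  SQ ratio: {sq(y, est) / sq(y, y_hat):.6f} "
        f"(formula {loss_of_fit_ratio(R, n):.6f})  "
        f"r ratio: {resid_corr(y, est) / resid_corr(y, y_hat):.6f} "
        f"(formula {correlation_ratio(R, n):.6f})"
    )
```

Under these assumptions the empirical ratios should agree with the formulas up to floating-point error, with the sum of squares growing and the residual correlation shrinking as n increases.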
    improvement of predicted least-squares-estimates
    multiple linear regression
    positive correlation
    residuals
    iterative procedure
    coefficient of determination
    goodness-of-fit
    loss of fit
    gain of correlation

    Identifiers