Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals (Q760991)
From MaRDI portal
scientific article; zbMATH DE number 3886904
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals | scientific article; zbMATH DE number 3886904 | |
Statements
Improvement of the predicted least-squares-estimates in multiple linear regression by means of an examination of residuals (English)
1984
When conventional least-squares estimates \(\hat y_i\) from a multiple linear regression are used for prediction, the positive correlation between the original values \(y_i\) and the residuals \(y_i-\hat y_i\) is often a considerable disadvantage. Correcting the \(\hat y_i\) therefore requires an examination of the residuals \(y_i-\hat y_i\). This paper proposes an iterative procedure for correcting and improving the original, uncorrected estimates \(\hat y_i\equiv \hat y_i^{(1)}\). Because \(y_i\) and \(y_i-\hat y_i\) are correlated, a first regression of the \(y_i-\hat y_i\) on the \(y_i\) leads to corrected values \(\hat y^*_i\), which are again positively correlated with the \(\hat y_i\). A second regression of the \(\hat y^*_i\) on the \(\hat y_i\) gives the estimates \(\hat y_i^{(\mathrm{new})}\). This transition from \(\hat y_i\) to \(\hat y_i^{(\mathrm{new})}\) can be iterated indefinitely. After the \(n\)-th iteration step one obtains the estimates
\[ \hat y_i^{(n+1)}=\bar y+R^{-1}\bigl(1-(1-R)^{n+1}\bigr)\bigl(\hat y_i^{(1)}-\bar y\bigr), \]
where \(R\) denotes the coefficient of determination (the square of the multiple correlation coefficient) of the original multiple linear regression and \(\bar y\) is the mean of the \(y_i\).

Both the goodness of fit between these estimates \(\hat y_i^{(n+1)}\) and the \(y_i\) and the unfavourable correlation \(r^{(n+1)}\) between the \(y_i\) and the residuals \(y_i-\hat y_i^{(n+1)}\) decrease with each additional iteration. That is,
\[ \sum_{i}\bigl[y_i-\hat y_i^{(n+1)}\bigr]^2=SQ^{(n+1)}\geq SQ^{(n)}=\sum_{i}\bigl[y_i-\hat y_i^{(n)}\bigr]^2 \quad\text{with}\quad \frac{SQ^{(n+1)}}{SQ^{(1)}}=1+\frac{(1-R)\bigl[1-(1-R)^n\bigr]^2}{R}, \]
and
\[ r^{(n+1)}\leq r^{(n)} \quad\text{with}\quad \frac{r^{(n+1)}}{r^{(1)}}=\frac{(1-R)^n\cdot \sqrt{R}}{\sqrt{(1-R)\bigl[1-(1-R)^n\bigr]^2+R}}. \]

"Loss of fit" and "gain of correlation" are therefore opposing properties of the proposed procedure. The procedure may be of special interest when the "loss of fit" remains sufficiently small while the unfavourable correlation decreases markedly. The "practicable number of iterations", which depends on two parameters, has been calculated explicitly. These approaches and results are discussed in detail and finally computed for a numerical economic example in which "farm income" depends on "size of farm", "size of dairy" and "number of men".
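The closed-form expressions in the abstract can be checked numerically. The following is a minimal sketch, not the paper's own computation: all function names are hypothetical, the toy data are invented for illustration, and a single-predictor least-squares fit with intercept stands in for the multiple regression as a simplifying assumption. It computes the corrected estimates after n iteration steps and compares the observed ratios SQ^(n+1)/SQ^(1) and r^(n+1)/r^(1) with the stated formulas.

```python
import math

def correction_factor(R, n):
    """Factor R^{-1} (1 - (1 - R)^(n+1)) applied to (y_hat - y_bar)."""
    return (1.0 - (1.0 - R) ** (n + 1)) / R

def corrected_estimates(y_hat, y_bar, R, n):
    """Estimates after n iteration steps, from the closed form in the abstract."""
    c = correction_factor(R, n)
    return [y_bar + c * (v - y_bar) for v in y_hat]

def sq_ratio(R, n):
    """Closed form for SQ^(n+1) / SQ^(1) ("loss of fit")."""
    q = 1.0 - R
    return 1.0 + q * (1.0 - q ** n) ** 2 / R

def corr_ratio(R, n):
    """Closed form for r^(n+1) / r^(1) (shrinkage of the residual correlation)."""
    q = 1.0 - R
    return q ** n * math.sqrt(R) / math.sqrt(q * (1.0 - q ** n) ** 2 + R)

def mean(v):
    return sum(v) / len(v)

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (s - mb) for p, s in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((s - mb) ** 2 for s in b)
    return cov / math.sqrt(va * vb)

def fit_simple_ols(x, y):
    """Fitted values of a least-squares line with intercept (one predictor)."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [my + slope * (a - mx) for a in x]

def observed_ratios(x, y, n):
    """Return (R, observed SQ ratio, observed correlation ratio) for step n."""
    y_bar = mean(y)
    y_hat1 = fit_simple_ols(x, y)
    sst = sum((b - y_bar) ** 2 for b in y)
    R = sum((f - y_bar) ** 2 for f in y_hat1) / sst  # coefficient of determination

    def sq_and_r(est):
        res = [b - e for b, e in zip(y, est)]
        return sum(e * e for e in res), pearson(y, res)

    sq1, r1 = sq_and_r(y_hat1)
    sqn, rn = sq_and_r(corrected_estimates(y_hat1, y_bar, R, n))
    return R, sqn / sq1, rn / r1

# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 3.8, 6.1, 5.9]

for n in range(4):
    R, sq_obs, r_obs = observed_ratios(x, y, n)
    print(n, round(sq_obs, 6), round(sq_ratio(R, n), 6),
          round(r_obs, 6), round(corr_ratio(R, n), 6))
```

For each n, the observed and closed-form ratios should agree up to rounding. For n = 0 both ratios equal 1, since the estimates are then the uncorrected least-squares values.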
improvement of predicted least-squares-estimates
multiple linear regression
positive correlation
residuals
iterative procedure
coefficient of determination
goodness-of-fit
loss of fit
gain of correlation