Feature informativeness, curse-of-dimensionality and error probability in discriminant analysis (Q2735975)
From MaRDI portal
scientific article; zbMATH DE number 1637162
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Feature informativeness, curse-of-dimensionality and error probability in discriminant analysis | scientific article; zbMATH DE number 1637162 | |
Statements
26 August 2001
dimensionality in discriminant analysis
informativeness
curse-of-dimensionality effect
growing dimension asymptotic approach
discriminant function
evaluation
feature selection
distance measure
weighted discriminant function
minimum error probability
Feature informativeness, curse-of-dimensionality and error probability in discriminant analysis (English)
This thesis treats problems of discriminant analysis in a high-dimensional setting. Basic facts of discriminant analysis are presented together with a brief review of the development of the subject, with a focus on how the curse-of-dimensionality phenomenon is reflected in the precision of discrimination. The problem of feature selection in discriminant analysis is considered and a survey of different techniques is presented. A growing dimension asymptotic approach is considered as a tool for treating the curse-of-dimensionality effect on the performance of discrimination. This approach makes it possible to establish limiting expressions for the error probabilities and thereby to evaluate analytically the effect of dimensionality on the error rate.
A consistent approximation to the likelihood-based discriminant function is proposed for the case when the dimensionality is comparable to the sample size. The concept of informativeness of a set of features and its effect on the precision of discrimination are discussed. A distance measure that gives different weights to different sets of features is proposed as the feature evaluation tool. An optimal type of weight function (in the sense of minimum error probability) is established, and the weighting scheme is illustrated by examples. These examples justify the proposed weighting and show that in the high-dimensional case the weighted discriminant function produces lower error probabilities than the usual one. The weighting technique is then elaborated by using an estimation procedure in the feature evaluation.
The presence of high-dimensional features is shown to lead to overestimation of their informativeness, which increases the error probability and thereby reflects the curse-of-dimensionality effect. The explicit form of the weight function that minimizes the limiting error probability when weighting by estimates is found.
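As a rough illustration of the growing dimension asymptotic approach mentioned in the review, the following Python sketch uses a textbook model that is an assumption here and is not taken from the thesis: two Gaussian classes with identity covariance, n training observations per class, and a plug-in linear rule built from the sample means only. In that model, letting the dimension p and n grow with p/n held fixed gives the classical limiting error Φ(−Δ²/(2√(Δ² + 2p/n))), where Δ is the distance between the class means. All function names are hypothetical.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error(delta: float) -> float:
    """Error of the rule that knows the true means: Phi(-delta / 2)."""
    return normal_cdf(-delta / 2.0)


def limiting_error(delta: float, ratio: float) -> float:
    """Limiting error of the plug-in rule when p / n -> ratio (n per class)."""
    return normal_cdf(-delta ** 2 / (2.0 * math.sqrt(delta ** 2 + 2.0 * ratio)))


def simulated_error(delta: float, p: int, n: int,
                    reps: int = 200, seed: int = 0) -> float:
    """Monte Carlo average of the class-1 error of the plug-in rule."""
    rng = random.Random(seed)
    shift = delta / math.sqrt(p)  # class 1 mean is shift * (1, ..., 1); class 2 mean is 0
    sd = 1.0 / math.sqrt(n)       # the mean of n N(mu, I) draws is N(mu, I / n)
    total = 0.0
    for _ in range(reps):
        xbar1 = [shift + rng.gauss(0.0, sd) for _ in range(p)]
        xbar2 = [rng.gauss(0.0, sd) for _ in range(p)]
        d = [a - b for a, b in zip(xbar1, xbar2)]
        mid = [0.5 * (a + b) for a, b in zip(xbar1, xbar2)]
        # For a new x ~ N(mu1, I), the score d.(x - mid) is N(d.(mu1 - mid), |d|^2),
        # so the conditional error P(score < 0) has a closed form.
        num = sum(di * (shift - mi) for di, mi in zip(d, mid))
        norm = math.sqrt(sum(di * di for di in d))
        total += normal_cdf(-num / norm)
    return total / reps


if __name__ == "__main__":
    delta, n = 2.0, 50
    print("known means:", round(bayes_error(delta), 4))
    for p in (10, 50, 200):
        print(p, round(simulated_error(delta, p, n), 4),
              round(limiting_error(delta, p / n), 4))
```

With Δ fixed, the limiting error in this assumed model rises from the known-means value Φ(−Δ/2) toward 1/2 as p/n grows, which is one simple way to see the effect of dimensionality on the error rate in closed form.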
A feature selection procedure in high-dimensional discriminant analysis is commonly used when measuring all features relevant to the discriminant analysis is "expensive". A threshold-based feature selection technique is introduced. It is incorporated into the discriminant function by means of an inclusion-exclusion factor, which eliminates the sets of features whose informativeness does not exceed a given threshold. The question is then how this type of selection, combined with high dimensionality, affects the precision of discrimination.
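The inclusion-exclusion idea described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the construction used in the thesis: features are grouped into contiguous blocks, the covariance is taken to be the identity, the informativeness of a block is estimated by the squared distance between the two sample means on that block, and the factor is 1 when the estimate exceeds the threshold and 0 otherwise. Under this assumed model the naive squared distance overshoots the true value by 2m/n on average for a block of m features and n observations per class, so the sketch subtracts that term. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Block = Tuple[int, int]  # half-open index range [start, stop) of one feature set


def block_informativeness(xbar1: Sequence[float], xbar2: Sequence[float],
                          blocks: Sequence[Block], n: int) -> List[float]:
    """Estimated squared mean distance per block, corrected for the 2m/n bias."""
    scores = []
    for start, stop in blocks:
        naive = sum((a - b) ** 2
                    for a, b in zip(xbar1[start:stop], xbar2[start:stop]))
        scores.append(naive - 2.0 * (stop - start) / n)
    return scores


def inclusion_factors(scores: Sequence[float], threshold: float) -> List[int]:
    """1 keeps a block, 0 drops it; a block must exceed the threshold to stay."""
    return [1 if s > threshold else 0 for s in scores]


def discriminant(x: Sequence[float], xbar1: Sequence[float],
                 xbar2: Sequence[float], blocks: Sequence[Block],
                 factors: Sequence[int]) -> float:
    """Linear score summed over kept blocks; positive values favour class 1."""
    total = 0.0
    for (start, stop), keep in zip(blocks, factors):
        if not keep:
            continue
        for i in range(start, stop):
            total += (xbar1[i] - xbar2[i]) * (x[i] - 0.5 * (xbar1[i] + xbar2[i]))
    return total


if __name__ == "__main__":
    xbar1 = [1.0, 1.0, 0.1, -0.1]   # made-up sample means for class 1
    xbar2 = [0.0, 0.0, 0.0, 0.0]    # made-up sample means for class 2
    blocks = [(0, 2), (2, 4)]
    scores = block_informativeness(xbar1, xbar2, blocks, n=20)
    factors = inclusion_factors(scores, threshold=0.5)
    print(factors, discriminant([1.0, 1.0, 5.0, 5.0], xbar1, xbar2, blocks, factors))
```

In the made-up example the second block carries almost no mean difference, so its factor is 0 and large values in those coordinates no longer influence the score.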