Noisy replication in skewed binary classification. (Q1583065)
From MaRDI portal
scientific article; zbMATH DE number 1521633
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Noisy replication in skewed binary classification. | scientific article; zbMATH DE number 1521633 | |
Statements
Noisy replication in skewed binary classification. (English)
26 October 2000
Skewed binary classification problems arise when estimating the ``success'' probabilities of new observations from a training data set with sparse ``successes'' and numerous ``failures''. Previously, the author [Comput. Stat. 14, 277--292 (1999; Zbl 0933.62050)] showed that replicating the ``successes'' in the training set with small added normal noise could slightly improve estimates in several common classification models, namely nearest neighbor, neural networks, classification trees, and quadratic discriminants. Here, much improved estimates are obtained for the same models: multiple noise-added training sets are generated from the given data set, a model is fitted to each, and the resulting estimates are averaged. The model average improves significantly on the single-model estimates in terms of both ROC area and Kullback-Leibler distance. In effect, the technique serves as a model-free regularization for the classification models considered.
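The averaging scheme is straightforward to sketch in code. The following Python snippet is a minimal illustration under stated assumptions, not the author's implementation: the Gaussian noise scale, the number of noisy copies per ``success'', the number of replicated training sets, and the use of scikit-learn's quadratic discriminant as the base model are all illustrative choices.

```python
# Minimal sketch of noisy replication with model averaging (illustrative only).
# The noise scale, replicate counts, and base classifier are assumptions,
# not taken from the article.
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.metrics import roc_auc_score


def noisy_replicate(X, y, n_copies=5, noise_scale=0.1, rng=None):
    """Augment the training set by replicating the sparse 'success' class
    (y == 1) n_copies times with small normal noise added to the features."""
    rng = np.random.default_rng(rng)
    X_pos = X[y == 1]
    noisy_copies = [X_pos + rng.normal(scale=noise_scale, size=X_pos.shape)
                    for _ in range(n_copies)]
    X_aug = np.vstack([X] + noisy_copies)
    y_aug = np.concatenate([y, np.ones(n_copies * len(X_pos), dtype=y.dtype)])
    return X_aug, y_aug


def averaged_estimates(X_train, y_train, X_test,
                       n_replicates=20, n_copies=5, noise_scale=0.1, seed=0):
    """Average 'success'-probability estimates over models fitted to
    independently generated noise-added training sets."""
    rng = np.random.default_rng(seed)
    probs = np.zeros(len(X_test))
    for _ in range(n_replicates):
        X_aug, y_aug = noisy_replicate(X_train, y_train, n_copies=n_copies,
                                       noise_scale=noise_scale, rng=rng)
        model = QuadraticDiscriminantAnalysis().fit(X_aug, y_aug)
        probs += model.predict_proba(X_test)[:, 1]
    return probs / n_replicates


# Example on a synthetic skewed data set: roughly 5% 'successes' whose
# features are shifted, so the ROC area of the averaged estimates is informative.
rng = np.random.default_rng(1)
y_train = (rng.random(500) < 0.05).astype(int)
X_train = rng.normal(size=(500, 4)) + 0.5 * y_train[:, None]
y_test = (rng.random(200) < 0.05).astype(int)
X_test = rng.normal(size=(200, 4)) + 0.5 * y_test[:, None]

p_hat = averaged_estimates(X_train, y_train, X_test)
print("ROC area of averaged estimates:", roc_auc_score(y_test, p_hat))
```

Averaging the probability estimates over independently generated noise-added training sets is what produces the regularization effect; in practice the noise scale and the number of replicates would have to be chosen, e.g. by cross-validation, which the sketch does not attempt.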
ROC curve
Kullback-Leibler distance