Noisy replication in skewed binary classification. (Q1583065)

From MaRDI portal

scientific article; zbMATH DE number 1521633

    Statements

    Noisy replication in skewed binary classification. (English)
    26 October 2000
    Skewed binary classification problems arise when estimating the "success" probabilities of new observations from a training data set that contains sparse "successes" and numerous "failures". Previously, the author [Comput. Stat. 14, 277–292 (1999; Zbl 0933.62050)] showed that replicating the "successes" in the training set with small normal noise added could slightly improve estimates in several common classification models, namely nearest neighbor, neural networks, classification trees, and quadratic discriminants. Now we form much improved estimates for the same models: by generating multiple noise-added versions of the training set from a given data set, we obtain an average of multiple model estimates. This model average is significantly improved in terms of both ROC area and Kullback-Leibler distance. The technique thus serves as an effective, model-free regularization for the classification models considered.
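The averaging scheme summarized above can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the choice of a k-nearest-neighbor estimator, and the parameters `versions`, `copies`, `sigma` and `k` are all assumptions made for illustration, since the abstract gives no noise scale or replication count.

```python
import math
import random


def knn_success_prob(train_x, train_y, query, k=5):
    """Fraction of 'successes' (label 1) among the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def noisy_replicate(train_x, train_y, copies, sigma, rng):
    """Append `copies` normally jittered replicas of every success to the data."""
    new_x, new_y = list(train_x), list(train_y)
    for x, y in zip(train_x, train_y):
        if y == 1:
            for _ in range(copies):
                new_x.append([v + rng.gauss(0.0, sigma) for v in x])
                new_y.append(1)
    return new_x, new_y


def averaged_estimate(train_x, train_y, query,
                      versions=20, copies=3, sigma=0.1, k=5, seed=0):
    """Average the estimates from several independently noise-added training sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(versions):
        xs, ys = noisy_replicate(train_x, train_y, copies, sigma, rng)
        total += knn_success_prob(xs, ys, query, k)
    return total / versions


if __name__ == "__main__":
    # Hypothetical skewed data: 40 failures around the origin, 2 successes near (2, 2).
    data_rng = random.Random(1)
    X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
    X += [[2.0, 2.0], [2.2, 1.9]]
    y = [0] * 40 + [1] * 2
    print("single model :", knn_success_prob(X, y, [2.1, 2.0]))
    print("model average:", averaged_estimate(X, y, [2.1, 2.0]))
```

Any of the other models named in the abstract could be substituted for `knn_success_prob`; only the replicate-then-average loop is specific to the technique.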
    ROC curve
    Kullback-Leibler distance
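For readers unfamiliar with the two evaluation criteria listed as keywords, the following sketch shows one common way to compute them for predicted success probabilities. The record does not state the exact definitions used in the article, so both functions are assumptions: the ROC area is computed as the Mann-Whitney statistic, and the Kullback-Leibler distance as the mean divergence from each observed 0/1 label to the predicted Bernoulli distribution (which equals the mean negative log-likelihood).

```python
import math


def roc_area(labels, scores):
    """Probability that a random success outscores a random failure (ties count 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_kl_distance(labels, probs, eps=1e-12):
    """Mean KL divergence from the observed label to the predicted Bernoulli(p)."""
    total = 0.0
    for l, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -math.log(p) if l == 1 else -math.log(1.0 - p)
    return total / len(labels)
```

A larger ROC area and a smaller Kullback-Leibler distance both indicate better probability estimates.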

    Identifiers