An incremental classification algorithm for mining data with feature space heterogeneity - MaRDI portal

An incremental classification algorithm for mining data with feature space heterogeneity (Q1718147)

Language: English
Label: An incremental classification algorithm for mining data with feature space heterogeneity
Description: scientific article; zbMATH DE number 7016183

    Statements

    An incremental classification algorithm for mining data with feature space heterogeneity (English)
    8 February 2019
    Summary: Feature space heterogeneity exists in many real-world data sets: the same feature can differ in importance for classification across different subsets of the data. Moreover, the pattern of feature space heterogeneity may change over time as more data are accumulated. In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem. In our approach, supervised clustering is used to obtain a number of clusters such that all samples in each cluster belong to the same class. After outliers are removed, the relevance of each feature in a cluster is calculated from its variation within that cluster. The feature relevance is then incorporated into the distance calculation used for classification. The main advantage of SCCFSH is that it can solve a classification problem with feature space heterogeneity incrementally, which is favorable for online classification tasks with continuously changing data. Experimental results on a series of data sets and an application to a database marketing problem show the efficiency and effectiveness of the proposed approach.
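The idea of weighting distances by per-cluster feature relevance can be illustrated with a minimal sketch. This is not the SCCFSH algorithm from the paper: the summary does not give its clustering procedure, outlier rule, relevance formula, or incremental update, so everything below is an assumption made for illustration. Here each class forms a single cluster, outlier removal and incremental updates are omitted, relevance is taken as the normalized inverse of the within-cluster variance, and the function names `feature_relevance`, `fit`, and `classify` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def feature_relevance(cluster, eps=1e-9):
    """Return (centroid, weights) for one cluster.

    Assumption: a feature that varies little inside the cluster is treated
    as more relevant, so each weight is the inverse of the within-cluster
    variance, normalized so the weights sum to 1.
    """
    n = len(cluster)
    dim = len(cluster[0])
    centroid = [sum(x[j] for x in cluster) / n for j in range(dim)]
    variances = [
        sum((x[j] - centroid[j]) ** 2 for x in cluster) / n for j in range(dim)
    ]
    inverse = [1.0 / (v + eps) for v in variances]  # eps avoids division by zero
    total = sum(inverse)
    return centroid, [w / total for w in inverse]


def fit(samples, labels):
    """Build one single-class cluster per label (a stand-in for supervised
    clustering) and compute its centroid and feature weights."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return {label: feature_relevance(group) for label, group in groups.items()}


def classify(x, model):
    """Assign x to the cluster with the smallest relevance-weighted distance."""
    best_label, best_dist = None, float("inf")
    for label, (centroid, weights) in model.items():
        dist = sqrt(sum(w * (a - c) ** 2 for w, a, c in zip(weights, x, centroid)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data: class "a" is tight in feature 0 and spread out in feature 1;
# class "b" is tight in both features.
samples = [(0, 0), (0, 10), (0, -10), (5, 0), (5.2, 0.1), (4.8, -0.1)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit(samples, labels)
print(classify((3, 9), model))
```

In this toy data set the query point (3, 9) is slightly closer to the centroid of class "b" under plain Euclidean distance, but the weighted distance assigns it to class "a", because feature 1 carries almost no weight in that cluster. That is the effect the summary attributes to incorporating feature relevance into the distance calculation.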

    Identifiers