CNODE: clustering of set-valued non-ordered discrete data - MaRDI portal

CNODE: clustering of set-valued non-ordered discrete data (Q1046594)

From MaRDI portal





scientific article; zbMATH DE number 5651387
Language: English
Label: CNODE: clustering of set-valued non-ordered discrete data
Description: scientific article; zbMATH DE number 5651387
Also known as: (none listed)

    Statements

    CNODE: clustering of set-valued non-ordered discrete data (English)
    22 December 2009
    Summary: This paper introduces a clustering technique named 'Clustering of set-valued Non-Ordered DiscretE data' (CNODE), in which each data item is a vector with a set of non-ordered discrete values per dimension. Since the usual definitions of distance, such as Euclidean and Manhattan, do not hold in a 'non-ordered discrete data space' (NDDS), other measures such as Hamming distance are often used to define the distance between vectors with single-valued discrete dimensions. This type of distance is not meaningful for set-valued dimensions; hence, we propose a similarity measure based on set intersection for clustering set-valued vectors. We also suggest a new measure, named 'lines of clustroids', for determining the quality of clustering for this type of data. In contrast to other existing clustering techniques in NDDS, CNODE does not rely on any kind of pre-processing of the dataset. Experiments with synthetic and real datasets show that CNODE is robust to data variations, scalable to large datasets and efficient for high dimensions.
    clustering
    set-valued data
    non-ordered discrete data
    categorical data
    intersection coefficient
    clustroids
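
The summary above argues that Hamming distance is not meaningful for set-valued dimensions and proposes a similarity based on set intersection instead. The summary does not give the formula for that measure, so the following is only a minimal sketch under stated assumptions: it uses a Jaccard-style overlap per dimension (intersection size divided by union size), averaged over all dimensions. The names `overlap`, `similarity` and `hamming` are hypothetical and are not taken from the paper.

```python
from typing import AbstractSet, Sequence

# A data item: one set of non-ordered discrete values per dimension.
SetVector = Sequence[AbstractSet[str]]


def overlap(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """Jaccard-style overlap |a & b| / |a | b|.

    ASSUMPTION: this normalisation is illustrative; the paper's own
    intersection-based measure may be defined differently.
    """
    union = a | b
    if not union:  # two empty sets are treated as identical
        return 1.0
    return len(a & b) / len(union)


def similarity(x: SetVector, y: SetVector) -> float:
    """Mean per-dimension overlap of two set-valued vectors, in [0, 1]."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal dimension")
    return sum(overlap(a, b) for a, b in zip(x, y)) / len(x)


def hamming(x: SetVector, y: SetVector) -> int:
    """Hamming distance: number of dimensions whose values differ at all."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal dimension")
    return sum(1 for a, b in zip(x, y) if a != b)


# Three 2-dimensional set-valued items (made-up example data).
x = [frozenset({"red", "blue"}), frozenset({"A"})]
y = [frozenset({"red"}), frozenset({"A"})]
z = [frozenset({"green"}), frozenset({"A"})]

print(hamming(x, y), hamming(x, z))        # 1 1
print(similarity(x, y), similarity(x, z))  # 0.75 0.5
```

In this made-up example Hamming distance rates `y` and `z` as equally far from `x`, because it only records that the first dimension differs. The intersection-based score separates them, since `x` and `y` share the value `red` while `x` and `z` share nothing in that dimension.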

    Identifiers