CNODE: clustering of set-valued non-ordered discrete data (Q1046594)
From MaRDI portal
scientific article; zbMATH DE number 5651387
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | CNODE: clustering of set-valued non-ordered discrete data | scientific article; zbMATH DE number 5651387 | |
Statements
CNODE: clustering of set-valued non-ordered discrete data (English)
22 December 2009
Summary: This paper introduces a clustering technique named 'Clustering of set-valued Non-Ordered DiscretE data' (CNODE), in which each data item is a vector with a set of non-ordered discrete values per dimension. Because the usual distance definitions, such as Euclidean and Manhattan, do not hold in a 'non-ordered discrete data space' (NDDS), other measures such as Hamming distance are often used to define the distance between vectors with single-valued discrete dimensions. This type of distance is not meaningful for set-valued dimensions, so we propose a similarity measure based on set intersection for clustering set-valued vectors. We also suggest a new measure, named 'lines of clustroids', for determining the quality of clustering for this type of data. In contrast to other existing clustering techniques in NDDS, CNODE does not rely on any pre-processing of the dataset. Experiments with synthetic and real datasets show that CNODE is robust to data variations, scalable to large datasets, and efficient for high dimensions.
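The contrast the summary draws between Hamming distance on single-valued vectors and an intersection-based similarity on set-valued vectors can be illustrated with a small sketch. The summary does not state the paper's formula, so the code below assumes a per-dimension ratio of intersection size to union size, averaged over dimensions. That choice, and the function names `hamming_distance` and `intersection_similarity`, are illustrative assumptions, not CNODE's actual definition.

```python
from typing import FrozenSet, Hashable, Sequence


def hamming_distance(u: Sequence[Hashable], v: Sequence[Hashable]) -> int:
    """Count the dimensions in which two single-valued vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(a != b for a, b in zip(u, v))


def intersection_similarity(
    u: Sequence[FrozenSet[Hashable]], v: Sequence[FrozenSet[Hashable]]
) -> float:
    """Average per-dimension |A & B| / |A | B| for set-valued vectors.

    ASSUMPTION: this normalisation is a stand-in chosen for illustration;
    the paper's own intersection-based measure may be defined differently.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    if not u:
        raise ValueError("vectors must have at least one dimension")
    total = 0.0
    for a, b in zip(u, v):
        union = a | b
        # Two empty sets are treated as identical in that dimension.
        total += len(a & b) / len(union) if union else 1.0
    return total / len(u)


# Single-valued vectors: Hamming distance only sees "same" or "different".
p = ("red", "cat")
q = ("red", "dog")
d_pq = hamming_distance(p, q)

# Set-valued vectors: partial overlap in a dimension gives partial credit.
x = (frozenset({"red", "blue"}), frozenset({"cat"}))
y = (frozenset({"red"}), frozenset({"cat", "dog"}))
z = (frozenset({"green"}), frozenset({"dog"}))
s_xy = intersection_similarity(x, y)
s_xz = intersection_similarity(x, z)
```

Under these assumptions, `x` and `y` overlap in half of the values in each dimension, while `x` and `z` share nothing. An equality test per dimension, as in Hamming distance, cannot express that difference.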
clustering
set-valued data
non-ordered discrete data
categorical data
intersection coefficient
clustroids