Turning Big Data Into Tiny Data: Constant-Size Coresets for $k$-Means, PCA, and Projective Clustering
From MaRDI portal
Publication:3304733
DOI10.1137/18M1209854zbMath1451.68244arXiv1807.04518MaRDI QIDQ3304733
Christian Sohler, Dan Feldman, Melanie Schmidt
Publication date: 3 August 2020
Published in: SIAM Journal on Computing (Search for Journal in Brave)
Full work available at URL: https://arxiv.org/abs/1807.04518
Factor analysis and principal components; correspondence analysis (62H25) Classification and discrimination; cluster analysis (statistical aspects) (62H30) Analysis of algorithms (68W40) Statistical aspects of big data and data science (62R07) Computational aspects of data analysis and big data (68T09)
Related Items
Minimum cost‐compression risk in principal component analysis, Coresets for \((k, \ell ) \)-median clustering under the Fréchet distance, A novel method for optimizing spectral rotation embedding \(K\)-means with coordinate descent, On coresets for fair clustering in metric and Euclidean spaces and their applications, Turning Grain Maps into Diagrams, Lossy kernelization of same-size clustering, The spherical \(k\)-means++ algorithm via local search scheme, Improved local search algorithms for Bregman \(k\)-means and its variants, Lossy kernelization of same-size clustering
Uses Software
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions
- Improved and simplified inapproximability for \(k\)-means
- Relative \((p,\varepsilon )\)-approximations in geometry
- Efficient subspace approximation algorithms
- Clustering large graphs via the singular value decomposition
- The VC dimension of \(k\)-fold union
- Smaller coresets for \(k\)-median and \(k\)-means clustering
- NP-hardness of Euclidean sum-of-squares clustering
- On the complexity of locating linear facilities in the plane
- Largest \(j\)-simplices in \(n\)-polytopes
- Singular value decomposition and least squares solutions
- Frequent Directions: Simple and Deterministic Matrix Sketching
- BICO: BIRCH Meets Coresets for k-Means Clustering
- Dimensionality Reduction for k-Means Clustering and Low Rank Approximation
- Mergeable summaries
- Randomized Dimensionality Reduction for <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula>-Means Clustering
- Approximating extent measures of points
- Randomized Algorithms for Matrices and Data
- Low-Rank Approximation and Regression in Input Sparsity Time
- Learnability and the Vapnik-Chervonenkis dimension
- On Coresets for k-Median and k-Means Clustering in Metric and Euclidean Spaces and Their Applications
- Linear-time approximation schemes for clustering problems in any dimensions
- On coresets for k-means and k-median clustering
- Coresets in dynamic geometric data streams
- Adaptive Sampling and Fast Low-Rank Matrix Approximation
- A PTAS for k-means clustering based on weak coresets
- The Planar k-Means Problem is NP-Hard
- Adaptive Sampling for k-Means Clustering
- Decomposable searching problems I. Static-to-dynamic transformation
- On the Early History of the Singular Value Decomposition
- Optimal Approximate Matrix Product in Terms of Stable Rank
- Local Search Yields a PTAS for $k$-Means in Doubling Metrics
- Local Search Yields Approximation Schemes for $k$-Means and $k$-Median in Euclidean and Minor-Free Metrics
- Twice-Ramanujan Sparsifiers
- Numerical linear algebra in the streaming model
- A fast and efficient algorithm for low-rank approximation of a matrix
- Coresets for Discrete Integration and Clustering
- StreamKM++
- A unified framework for approximating and clustering data
- FSTTCS 2004: Foundations of Software Technology and Theoretical Computer Science
- Lower Bounds for Approximation by Nonlinear Manifolds
- Calculating the Singular Values and Pseudo-Inverse of a Matrix
- Turning Big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering
- FSTTCS 2005: Foundations of Software Technology and Theoretical Computer Science
- Improved bounds on the sample complexity of learning