Sparse PCA: a Geometric Approach

From MaRDI portal
Publication:6413772

arXiv2210.06590MaRDI QIDQ6413772

Author name not available (Why is that?)

Publication date: 12 October 2022

Abstract: We consider the problem of maximizing the variance explained from a data matrix using orthogonal sparse principal components that have a support of fixed cardinality. While most existing methods focus on building principal components (PCs) iteratively through deflation, we propose GeoSPCA, a novel algorithm to build all PCs at once while satisfying the orthogonality constraints which brings substantial benefits over deflation. This novel approach is based on the left eigenvalues of the covariance matrix which helps circumvent the non-convexity of the problem by approximating the optimal solution using a binary linear optimization problem that can find the optimal solution. The resulting approximation can be used to tackle different versions of the sparse PCA problem including the case in which the principal components share the same support or have disjoint supports and the Structured Sparse PCA problem. We also propose optimality bounds and illustrate the benefits of GeoSPCA in selected real world problems both in terms of explained variance, sparsity and tractability. Improvements vs. the greedy algorithm, which is often at par with state-of-the-art techniques, reaches up to 24% in terms of variance while solving real world problems with 10,000s of variables and support cardinality of 100s in minutes. We also apply GeoSPCA in a face recognition problem yielding more than 10% improvement vs. other PCA based technique such as structured sparse PCA.




Has companion code repository: https://github.com/DLKitane/GEOSPCA








This page was built for publication: Sparse PCA: a Geometric Approach

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6413772)