Statistical Learning

Exactly Uncorrelated Sparse Principal Component Analysis

Pages 231-241 | Received 28 Sep 2022, Accepted 28 Jun 2023, Published online: 30 Aug 2023
 

Abstract

Sparse principal component analysis (PCA) aims to find principal components as linear combinations of a subset of the original input variables without sacrificing the fidelity of classical PCA. Most existing sparse PCA methods produce correlated sparse principal components. We argue that many applications of PCA prefer uncorrelated principal components. However, handling both sparsity and uncorrelatedness in a sparse PCA method is nontrivial. This article proposes an exactly uncorrelated sparse PCA method named EUSPCA, whose formulation is motivated by the original views of PCA advocated by Pearson and Hotelling. EUSPCA is a non-smooth, constrained, non-convex manifold optimization problem. We solve it by combining augmented Lagrangian and non-monotone proximal gradient methods. Through simulated and real data examples, we observe that EUSPCA produces uncorrelated components while maintaining a similar or better level of fidelity, as measured by adjusted total variance. In contrast, existing sparse PCA methods produce significantly correlated components. Supplemental materials for this article are available online.
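The two quantities the abstract relies on, correlation between components and adjusted total variance, can be checked numerically for any loading matrix. The following is a minimal sketch, not the authors' code. It assumes the QR-based definition of adjusted variance that is common in the sparse PCA literature (the squared diagonal of R in the QR decomposition of the component scores), and the division by n - 1 is a scaling convention chosen here. The function names and the arguments `X` (data matrix) and `V` (loading matrix) are hypothetical.

```python
import numpy as np

def component_correlations(X, V):
    """Correlation matrix of the component scores Z = Xc @ V.

    X is an (n, p) data matrix and V a (p, k) loading matrix.
    The components are uncorrelated exactly when every
    off-diagonal entry of the returned (k, k) matrix is zero.
    """
    Xc = X - X.mean(axis=0)          # center each variable
    Z = Xc @ V                       # component scores
    return np.corrcoef(Z, rowvar=False)

def adjusted_total_variance(X, V):
    """Adjusted total variance via the QR decomposition Z = QR.

    Assumption: the adjusted variance of component j is R[j, j]**2,
    here divided by n - 1 so it is on the sample-variance scale.
    For uncorrelated components this equals the plain sum of the
    component variances; correlation between components lowers it.
    """
    Xc = X - X.mean(axis=0)
    Z = Xc @ V
    R = np.linalg.qr(Z, mode="r")    # only the triangular factor is needed
    return float(np.sum(np.diag(R) ** 2) / (X.shape[0] - 1))
```

With ordinary PCA loadings (leading eigenvectors of the sample covariance), the off-diagonal correlations are zero up to rounding and the adjusted total variance equals the sum of the corresponding eigenvalues. Passing sparse loadings from another method into the same two functions shows how far its components are from uncorrelated.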

Supplementary Materials

Online appendix: The online appendix consists of three sections. Appendix A includes proofs of the theorems. Appendix B presents an additional discussion of EUSPCA and related methods. Appendix C provides curves of total adjusted variance versus sparsity for the two real datasets.

R-package: The R package euspca, available at https://github.com/ohrankwon/euspca, contains code for solving EUSPCA with l1 regularization using a combination of augmented Lagrangian and non-monotone proximal gradient methods.
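For readers unfamiliar with proximal gradient methods, the step that handles an l1 penalty is entrywise soft-thresholding. The sketch below shows only that standard building block. It is not taken from the euspca package, and it omits the augmented Lagrangian terms, the constraints, and the non-monotone line search. The names `soft_threshold` and `tau` are hypothetical.

```python
import numpy as np

def soft_threshold(V, tau):
    """Proximal operator of tau * ||V||_1, applied entrywise.

    Each entry is shrunk toward zero by tau, and entries whose
    magnitude is at most tau become exactly zero, which is what
    produces sparsity in the loadings.
    """
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)
```

In a generic proximal gradient iteration, a gradient step on the smooth part of the objective would be followed by `soft_threshold` with `tau` equal to the step size times the l1 penalty weight.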

Acknowledgments

The authors would like to thank the editor, the AE, and two anonymous reviewers for constructive comments, which have improved the quality of the article.

Disclosure Statement

The authors report there are no competing interests to declare.

Additional information

Funding

Lu’s research is supported in part by NSF IIS-2211491. Zou’s research is supported in part by NSF DMS-1915842 and 2015120.
