Sparse Learning

Scalable Model-Free Feature Screening via Sliced-Wasserstein Dependency

Pages 1501-1511 | Received 20 Oct 2022, Accepted 16 Jan 2023, Published online: 12 Apr 2023
 

Abstract

We consider the model-free feature screening problem, which aims to discard non-informative features before downstream analysis. Most existing feature screening approaches have at least quadratic computational cost with respect to the sample size n and thus may suffer from a heavy computational burden when n is large. To alleviate this burden, we propose a scalable model-free sure independence screening approach. The approach is based on the so-called sliced-Wasserstein dependency, a novel metric that measures the dependence between two random variables. Specifically, we quantify the dependence between two random variables by measuring the sliced-Wasserstein distance between their joint distribution and the product of their marginal distributions. For a predictor matrix of size n × d, the computational cost of the proposed algorithm is of order O(n log(n) d), even when the response variable is multivariate. Theoretically, we show that the proposed method enjoys both the sure screening and rank consistency properties under mild regularity conditions. Numerical studies on various synthetic and real-world datasets demonstrate the superior performance of the proposed method in comparison with mainstream competitors, while requiring significantly less computational time. Supplementary materials for this article are available online.
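
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' estimator or their R implementation. It makes two assumptions of its own: the product of the marginals is emulated by randomly permuting the response, and the sliced-Wasserstein distance is approximated by averaging one-dimensional Wasserstein-1 distances over random projections. The names `sliced_wasserstein`, `sw_dependency`, `screen`, and the parameter `n_proj` are hypothetical and chosen only for illustration.

```python
import math
import random


def _unit_direction(dim, rng):
    """Draw a direction uniformly at random from the unit sphere in R^dim."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def sliced_wasserstein(a, b, n_proj=50, rng=None):
    """Monte Carlo sliced Wasserstein-1 distance between two equal-size samples.

    Each projection needs two sorts, so the cost is O(n_proj * n log n).
    """
    if rng is None:
        rng = random.Random(0)
    dim = len(a[0])
    total = 0.0
    for _ in range(n_proj):
        theta = _unit_direction(dim, rng)
        pa = sorted(sum(t * c for t, c in zip(theta, row)) for row in a)
        pb = sorted(sum(t * c for t, c in zip(theta, row)) for row in b)
        # In one dimension, W1 between equal-size samples pairs order statistics.
        total += sum(abs(u - v) for u, v in zip(pa, pb)) / len(pa)
    return total / n_proj


def sw_dependency(x, y, n_proj=50, seed=0):
    """Illustrative dependence score between a scalar feature x and a response y.

    Assumption (not from the paper): permuting y breaks the pairing with x,
    which gives an approximate sample from the product of the marginals.
    """
    rng = random.Random(seed)
    # Accept scalar or multivariate responses.
    y_rows = [list(r) if isinstance(r, (list, tuple)) else [r] for r in y]
    joint = [[xi] + yi for xi, yi in zip(x, y_rows)]
    shuffled = y_rows[:]
    rng.shuffle(shuffled)
    product = [[xi] + yi for xi, yi in zip(x, shuffled)]
    return sliced_wasserstein(joint, product, n_proj, rng)


def screen(features, y, keep, n_proj=50, seed=0):
    """Score every feature column and return the indices of the `keep` largest."""
    scores = [sw_dependency(col, y, n_proj, seed) for col in features]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:keep], scores
```

Calling `screen(columns, y, keep=k)` on a list of d feature columns returns the k highest-scoring column indices together with all d scores. For a fixed number of projections, each feature costs O(n log n) because of the sorting step, so the full pass over d features is consistent with the O(n log(n) d) order stated in the abstract.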

Supplementary Materials

Appendix: Contains the complete proofs of the theoretical results, along with additional experiments: two real data examples, simulation results based on the second criterion, and feature screening results for categorically distributed features and response. (appendix.pdf, a PDF file)

Code: Contains R code that implements the proposed method and reproduces the numerical results. A readme file describing the contents is included. (code.zip, a zip file)

Acknowledgments

We thank the Editor, the Associate Editor, and two anonymous reviewers for their constructive comments, which helped improve this work.

Disclosure Statement

The authors report there are no competing interests to declare.

Additional information

Funding

The authors acknowledge support from the National Natural Science Foundation of China (grants No. 12101606, No. 12001042, and No. 12271522), the Renmin University of China research fund program for young scholars, and the Beijing Institute of Technology research fund program for young scholars.
