Statistical Learning

Accelerated and Interpretable Oblique Random Survival Forests

Byron C. Jaeger (a, corresponding author)
Sawyer Welden (a)
Kristin Lenoir (a), https://orcid.org/0000-0003-1834-7398
Jaime L. Speiser (a)
Matthew W. Segar (b)
Ambarish Pandey (c)
Nicholas M. Pajewski (a), https://orcid.org/0000-0002-4447-6196

(a) Department of Biostatistics and Data Science, Wake Forest University School of Medicine, Winston-Salem, NC
(b) Department of Cardiology, Texas Heart Institute, Houston, TX
(c) Division of Cardiology, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX

Abstract

The oblique random survival forest (RSF) is an ensemble supervised learning method for right-censored outcomes. Trees in the oblique RSF are grown using linear combinations of predictors, whereas in the standard RSF, a single predictor is used. Oblique RSF ensembles have high prediction accuracy, but assessing many linear combinations of predictors induces high computational overhead. In addition, few methods have been developed for estimation of variable importance (VI) with oblique RSFs. We introduce a method to increase computational efficiency of the oblique RSF and a method to estimate VI with the oblique RSF. Our computational approach uses Newton-Raphson scoring in each non-leaf node. We estimate VI by negating each coefficient used for a given predictor in linear combinations, and then computing the reduction in out-of-bag accuracy. In benchmarking experiments, we find our implementation of the oblique RSF is hundreds of times faster, with equivalent prediction accuracy, compared to existing software for oblique RSFs. We find in simulation studies that “negation VI” discriminates between relevant and irrelevant numeric predictors more accurately than permutation VI, Shapley VI, and a technique to measure VI using analysis of variance. All oblique RSF methods in the current study are available in the aorsf R package, and additional supplemental materials are available online.
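The idea behind negation VI can be illustrated with a minimal sketch. It is a deliberate simplification and not the aorsf implementation: it assumes a single fixed linear combination of predictors instead of a forest of oblique trees, it scores accuracy on the full toy dataset instead of out-of-bag observations, and it uses a basic pairwise C-statistic that skips tied event times. The function names (`c_statistic`, `linear_risk`, `negation_vi`), the data, and the coefficients are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def c_statistic(times, events, risks):
    """Pairwise concordance for right-censored data (simple O(n^2) version)."""
    concordant, usable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is usable only when the shorter time is an observed event;
        # tied times are skipped here to keep the sketch short.
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    return concordant / usable


def linear_risk(rows, coefs):
    """Risk score as a linear combination of predictors."""
    return [sum(c * x for c, x in zip(coefs, row)) for row in rows]


def negation_vi(rows, times, events, coefs):
    """Reduction in the C-statistic after negating each coefficient in turn."""
    baseline = c_statistic(times, events, linear_risk(rows, coefs))
    importance = []
    for k in range(len(coefs)):
        flipped = list(coefs)
        flipped[k] = -flipped[k]  # negate only the k-th predictor's coefficient
        negated = c_statistic(times, events, linear_risk(rows, flipped))
        importance.append(baseline - negated)
    return importance


# Hypothetical toy data: the first predictor drives risk, the second is noise.
rows = [(2.0, 0.3), (1.5, -0.2), (1.0, 0.1), (0.5, -0.4), (-0.5, 0.2), (-1.0, -0.1)]
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 0]  # 1 = event observed, 0 = censored
coefs = [1.0, 0.1]           # assumed coefficients, not fitted values

vi = negation_vi(rows, times, events, coefs)
print(vi)  # [1.0, 0.0]
```

In this toy example, negating the coefficient of the first predictor reverses the risk ordering and removes all concordance, while negating the small coefficient of the noise predictor leaves the ordering unchanged, so its importance is zero.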

Supplementary Materials

Supplemental materials for the current analysis are available online.

Code: The code used to generate results in the current analysis is available online at https://github.com/bcjaeger/aorsf-bench and is provided as supplemental material (aorsf_bench.zip).

aorsf: The R package used to fit oblique RSFs in the current analysis is available online at https://github.com/ropensci/aorsf, and version 0.0.7.9000 is provided as supplemental material (aorsf_package.zip).

Disclosure Statement

No potential conflict of interest was reported by the authors.

Notes

1 Menze et al. (2011) name their method “oblique RF VI,” but we use the name “ANOVA VI” in this article to avoid confusing Menze’s approach with other approaches to estimate VI for oblique RFs.

2 The aorsf package automatically scales numeric inputs to a mean of zero and standard deviation of one.

3 The aorsf package enables customized functions to be applied in lieu of the default C-statistic.

4 For example, when the task was to predict risk of death in the ACTG 320 clinical trial (26 events total), some splits did not leave enough events in the training data to fit complex learners such as neural networks.

5 Although the party package implements the approach to VI developed by Strobl et al. (2007), the developers of the party package note that the implementation of this approach for survival outcomes is “extremely slow and experimental” as of version 1.3.10. Therefore, it is not incorporated in the current simulation study.

Additional information

Funding

The authors gratefully acknowledge the Center for Biomedical Informatics, Wake Forest University School of Medicine for supporting this research. The project was also supported by the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health through Grant Award Number UL1TR001420. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Dr. Pajewski was supported by grant number P30AG021332 from the National Institutes of Health, while Dr. Speiser was supported by K25AG068253.

