Bayesian Methods

Robust Transformations for Multiple Regression via Additivity and Variance Stabilization

Pages 85-100 | Received 13 Oct 2021, Accepted 05 Apr 2023, Published online: 26 May 2023
 

ABSTRACT

Outliers can have a major effect on the estimated transformation of the response in linear regression models, as they can on the estimates of the coefficients of the fitted model. The effect is more extreme in the Generalized Additive Models (GAMs) that are the subject of this article, as the forms of the terms in the model can also be affected. We develop, describe and illustrate robust methods for the nonparametric transformation of the response and estimation of the terms in the model. Numerical integration is used to calculate the estimated variance stabilizing transformation. Robust regression provides outlier-free input to the polynomial smoothers used in the calculation of the response transformation and in the backfitting algorithm for estimation of the functions of the GAM. Our starting point was the AVAS (Additivity and VAriance Stabilization) algorithm of Tibshirani. Even if robustness is not required, we have made four further general optional improvements to AVAS which greatly improve the performance of Tibshirani’s original Fortran program.

We provide a publicly available and fully documented interactive program for our procedure which is a robust form of Tibshirani’s AVAS that allows many forms of robust regression. We illustrate the efficacy of our procedure through data analyses. A refinement of the backfitting algorithm has interesting implications for robust model selection. Supplementary materials for this article are available online.

Supplementary Materials

The first three sections of the supplementary material provide flow charts for the program we wrote to implement RAVAS. Section 2 is the initialisation of the fitting procedure including implementation of the four non-iterative options; Section 3 describes the iterative part of the algorithm and sets up the environment for the numerical variance stabilizing transformation; Section 4 provides the fitted values and residuals to which the trapezoidal integration approximation is applied; Section 5 provides links, inter alia, to the code used for the calculations and the plotting of graphs.
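As a rough illustration of the trapezoidal step mentioned above, the following Python sketch approximates the standard variance stabilizing transformation h(t) = ∫ v(u)^(-1/2) du with a cumulative trapezoidal rule over the sorted fitted values. It is not the authors' program: the function name `variance_stabilizing_transform` and its arguments are hypothetical, and the variance estimate at each fitted value is assumed to be supplied by the caller (for instance, from a smooth of squared residuals against fitted values).

```python
import numpy as np


def variance_stabilizing_transform(fitted, variance):
    """Approximate h(t) = integral of 1/sqrt(v(u)) du by the trapezoidal rule.

    fitted   : fitted values at which the transformation is wanted
    variance : assumed (caller-supplied) positive variance estimate at each fitted value
    Returns h evaluated at each fitted value, with h = 0 at the smallest one.
    """
    fitted = np.asarray(fitted, dtype=float)
    variance = np.asarray(variance, dtype=float)

    # Integrate along the sorted fitted values.
    order = np.argsort(fitted)
    u = fitted[order]
    integrand = 1.0 / np.sqrt(variance[order])

    # Area of each trapezoid between consecutive sorted points.
    areas = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u)

    # Cumulative sum gives the integral from the smallest fitted value.
    h_sorted = np.concatenate(([0.0], np.cumsum(areas)))

    # Return the values in the original observation order.
    h = np.empty_like(h_sorted)
    h[order] = h_sorted
    return h


if __name__ == "__main__":
    # When the standard deviation is proportional to the mean, v(u) = u**2,
    # the exact transformation is the logarithm, which gives a simple check.
    u = np.linspace(1.0, 10.0, 2001)
    h = variance_stabilizing_transform(u, u**2)
    print("max abs deviation from log(u):", np.max(np.abs(h - np.log(u))))
```

The check against the logarithm only confirms the quadrature; in a robust fit the quality of the result depends on the variance estimate passed in, which this sketch does not attempt to compute.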

Acknowledgments

We are very grateful to the editor, associate editor and a referee, whose comments greatly helped us to clarify the presentation of our work. Our research has benefited from the High Performance Computing (HPC) facility of the University of Parma. We acknowledge financial support from the University of Parma project “Robust statistical methods for the detection of frauds and anomalies in complex and heterogeneous data.”

Disclosure Statement

The authors confirm there are no competing interests to be declared.
