How much should we trust R² and adjusted R²: evidence from regressions in top economics journals and Monte Carlo simulations: Journal of Applied Economics: Vol 26, No 1

ABSTRACT

R² and adjusted R² may exaggerate a model’s true ability to predict the dependent variable in the presence of overfitting, whereas leave-one-out R² (LOOR²) is robust to overfitting. We demonstrate this by replicating 279 regressions from 100 papers in top economics journals, where the median increases of R² and adjusted R² over LOOR² reach 40.2% and 21.4% respectively. The inflation of test errors over training errors increases with the severity of overfitting as measured by the number of regressors and nonlinear terms, and the presence of outliers, but decreases with the sample size. These results are further validated by Monte Carlo simulations.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/15140326.2023.2207326.

Notes

¹ To be sure, economics is not the only discipline in this regard. For example, Parady et al. (2021) lament the overreliance on statistical goodness-of-fit and under-reliance on model validation in the transportation literature.

² The original formula for adjusted R² was first proposed in a paper by M. J. B. Ezekiel, who read it before the Mathematical Society at its annual meeting in 1928, but gave the credit to B. B. Smith.

³ We ignore the case of linear regression without a constant term, as it is rarely encountered in practice.

⁴ For example, the short-cut algorithm for computing LOOR² could be implemented in Stata by using the user-written command “cv_regress” (Rios-Avila, 2018) after the usual “regress” command for OLS regression.

⁵ These terms are in the same spirit as “variance inflation factor” (VIF).

⁶ In fact, the presence of many covariates also increases the complexity of the regression function.

⁷ These four journals are selected partly because their replication data and programs are more easily accessible. See the Appendix for a complete list of these 100 papers.

⁸ Note that Dower et al. (2021) only report R².

⁹ The results of using EIF or adjusted EIF as the dependent variables are qualitatively similar, but the fit is slightly worse. To save space, we only report results using Log(EIF) and Log(Adjusted EIF) as the dependent variables.

¹⁰ Typically, the sample size varies across regressions within a paper because adding more variables may introduce missing observations.

¹¹ As pointed out by an anonymous referee, adding nonlinear terms can be viewed as a particular case of including additional correlated covariates.

¹² We thank an anonymous referee for useful discussions about the relation between overfitting and parameter significance; more studies are needed in this direction.
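
As an illustration of the three fit measures compared in the abstract and of the short-cut algorithm mentioned in note 4, the following Python sketch computes R², adjusted R² and LOOR² for an OLS regression with a constant term. It is not the authors' code. It assumes that LOOR² is defined as one minus the ratio of the sum of squared leave-one-out prediction errors to the total sum of squares, and it uses the standard leverage identity (leave-one-out error = OLS residual divided by 1 − h_ii) as the short-cut. The function names `fit_measures` and `loo_r2_brute_force` and the simulated data are hypothetical, chosen only for this example.

```python
import numpy as np


def fit_measures(X, y):
    """Return (R2, adjusted R2, LOOR2) for OLS of y on X plus a constant.

    Assumption: LOOR2 = 1 - PRESS / TSS, where PRESS is the sum of squared
    leave-one-out prediction errors.
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), X])      # design matrix with a constant term
    k = Z.shape[1]                            # number of coefficients, incl. constant
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    tss = np.sum((y - y.mean()) ** 2)         # total sum of squares
    rss = np.sum(resid ** 2)                  # residual (training) sum of squares

    # Leverages h_ii: diagonal of the hat matrix Z (Z'Z)^{-1} Z'.
    h = np.einsum("ij,ji->i", Z, np.linalg.pinv(Z))
    # Short-cut: the leave-one-out error for observation i is resid_i / (1 - h_ii).
    press = np.sum((resid / (1.0 - h)) ** 2)

    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (rss / (n - k)) / (tss / (n - 1))
    loo_r2 = 1.0 - press / tss
    return r2, adj_r2, loo_r2


def loo_r2_brute_force(X, y):
    """Same LOOR2, obtained by refitting the regression n times."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # drop observation i
        beta, *_ = np.linalg.lstsq(Z[keep], y[keep], rcond=None)
        errors[i] = y[i] - Z[i] @ beta        # out-of-sample prediction error
    return 1.0 - np.sum(errors ** 2) / np.sum((y - y.mean()) ** 2)


if __name__ == "__main__":
    # Simulated data (hypothetical): 30 observations, 10 regressors,
    # only the first regressor actually affects y.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 10))
    y = X[:, 0] + rng.normal(size=30)
    r2, adj_r2, loo_r2 = fit_measures(X, y)
    print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}, LOOR2 = {loo_r2:.3f}")
    print(f"brute-force LOOR2 = {loo_r2_brute_force(X, y):.3f}")
```

Because every leverage lies between 0 and 1, each leave-one-out error is at least as large in absolute value as the corresponding residual, so LOOR² computed this way can never exceed R². The gap between the two tends to widen as regressors are added relative to the sample size, which is the pattern the abstract describes.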

Additional information

Notes on contributors

Qiang Chen

Qiang Chen is a professor at the School of Economics, Shandong University.

Ji Qi

Ji Qi is a PhD student at the School of Economics, Shandong University.
