ABSTRACT
Traffic crashes differ in the manner in which the collision occurs (the collision type), and the countermeasures that reduce crashes can vary significantly with collision type. The complexity of crash mechanisms motivates this study, which identifies significant factors influencing collision type with the goal of better countermeasure deployment. The objective is to compare the performance of statistical and machine learning (ML) models in classifying crashes by collision type, and to assess their generalizability and interpretability. The data-driven methods compared include discrete choice models, Bayesian classifiers, tree-based algorithms, and support vector machines. Results indicate that tree-based algorithms perform consistently well and offer higher interpretability, along with out-of-distribution robustness. However, while ML models provide a flexible framework for modeling large data volumes, statistical models provide additional interpretability regarding the effect of critical variables on crash mechanisms, which is relevant from a safety management standpoint.
Acknowledgments
The authors would like to thank the Pennsylvania Department of Transportation for providing the data used in this analysis. The authors would also like to thank the graduate students at Penn State who added to this database in the summer of 2018.
Disclosure statement
No potential conflict of interest was reported by the authors.
Disclaimer
The contents of this paper reflect the views of the authors, who are responsible for the facts and accuracy of the data presented herein. The contents do not necessarily reflect the official views or policies of the Federal Highway Administration or the Commonwealth of Pennsylvania at the time of publication. This paper does not constitute a standard, specification, or regulation.