Dimensions of uncertainty: a spatiotemporal review of five COVID-19 datasets

Pages 200-221 | Received 28 Apr 2021, Accepted 29 Aug 2021, Published online: 25 Oct 2021
 

ABSTRACT

COVID-19 surveillance across the United States is essential to tracking and mitigating the pandemic, but data representing cases and deaths may be affected by attribute, spatial, and temporal uncertainties. Case and death data also serve as key inputs for prediction models that inform policy decisions, so consistent information across datasets is critical to ensuring coherent findings. We implement an exploratory data analytic approach to characterize, synthesize, and visualize spatial-temporal dimensions of uncertainty across commonly used datasets for case and death metrics (Johns Hopkins University (JHU), the New York Times (NYT), USAFacts, and 1Point3Acres). We scrutinize data consistency to assess where and when disagreements occur, potentially indicating underlying uncertainty. We compare cumulative case and death rates across datasets to highlight discrepancies and identify spatial patterns. Data are assessed using pairwise agreement (Cohen’s kappa) and agreement across all datasets (Fleiss’ kappa) to summarize changes over time. Findings suggest the highest agreement among the Centers for Disease Control and Prevention (CDC), JHU, and NYT datasets. We find nine discrete type-components of information uncertainty for COVID-19 datasets, reflecting various complex processes. Understanding the processes and indicators of uncertainty in COVID-19 data reporting is especially relevant for public health professionals and policymakers, who must accurately understand and communicate information about the pandemic.
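
To make the two agreement statistics named above concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation (their code is in the notebook linked under the data availability statement). It assumes, purely for illustration, that each dataset assigns every county a categorical label on a given date. The labels ("low", "mid", "high"), the dictionary `labels`, and the values in it are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of items on which both sources match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each source's marginal label frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] lists the labels all sources give item i."""
    if not ratings:
        raise ValueError("ratings must be non-empty")
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each item needs the same number (>= 2) of labels")
    totals = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        totals.update(counts)
        # Share of agreeing source pairs for this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))
    p_bar = agreement_sum / n_items
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    if p_e == 1.0:
        return 1.0  # only one label ever used
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical labels for six counties on one date (illustrative values only).
labels = {
    "CDC": ["low", "low", "mid", "high", "high", "mid"],
    "JHU": ["low", "low", "mid", "high", "mid", "mid"],
    "NYT": ["low", "mid", "mid", "high", "high", "mid"],
}

# Pairwise agreement between every two datasets.
for left, right in combinations(labels, 2):
    print(left, right, round(cohen_kappa(labels[left], labels[right]), 3))

# Agreement across all datasets at once: one row of labels per county.
per_county = [list(row) for row in zip(*labels.values())]
print("all", round(fleiss_kappa(per_county), 3))
```

Repeating the calculation for each reporting date would give a time series of kappa values, which is one way to summarize how agreement changes over time.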

Acknowledgments

This research was made possible by the open source and open access efforts of the New York Times and Johns Hopkins University, and by the continuing public access to data from the CDC and USAFacts. Thanks to 1Point3Acres for continued data use permissions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed here.

Data availability and code availability statement

The data used at the time of writing and the code notebook for this analysis are available at https://github.com/geoda/covid-uncertainty. An interactive version of key figures in this paper is available at https://observablehq.com/@uscovidatlas/data-uncertainty-national-us-covid-data. To explore differences among the datasets for particular counties over time, see the interactive code notebook at https://colab.research.google.com/drive/1iRKtRsNf-tBYJN6Jx_0bYgQpVY0p3onj.

Additional information

Funding

The US Covid Atlas project is funded in part by the Robert Wood Johnson Foundation. This research was also supported by the National Institutes of Health through the NIH HEAL Initiative under award number U2CDA050098.