ORIGINAL RESEARCH

A Novel Chronic Kidney Disease Phenotyping Algorithm Using Combined Electronic Health Record and Claims Data

Pages 299-307 | Received 10 Nov 2022, Accepted 16 Feb 2023, Published online: 08 Mar 2023
 

Abstract

Purpose

Because chronic kidney disease (CKD) is often under-coded as a diagnosis in claims data, we aimed to develop claims-based prediction models for CKD phenotypes determined by laboratory results in electronic health records (EHRs).

Patients and Methods

We linked EHR data from two networks (used as the training and validation cohorts, respectively) with Medicare claims data. The study cohort included individuals aged ≥65 years with a valid serum creatinine result in the EHR from 2007 to 2017, excluding those with end-stage kidney disease or on dialysis. We used LASSO regression to select among 134 candidate predictors of continuous estimated glomerular filtration rate (eGFR). We assessed model performance in predicting eGFR categories of <60, <45, and <30 mL/min/1.73 m² in terms of the area under the receiver operating characteristic curve (AUC).
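The evaluation step described above, scoring a continuous eGFR prediction against a laboratory-defined eGFR category, can be illustrated with a minimal sketch. This is not the authors' code: the function name `auc_below_threshold` and the toy values are hypothetical, and the LASSO fitting step is omitted. The sketch assumes that a lower predicted eGFR ranks a patient as more likely to fall below the threshold, and it computes the AUC as the proportion of case/non-case pairs ranked correctly (ties count as half).

```python
from typing import Sequence


def auc_below_threshold(
    measured_egfr: Sequence[float],
    predicted_egfr: Sequence[float],
    threshold: float,
) -> float:
    """AUC for identifying measured eGFR < threshold from predicted eGFR.

    A "case" is a patient whose measured (laboratory) eGFR is below the
    threshold. The AUC is the probability that a randomly chosen case has
    a lower predicted eGFR than a randomly chosen non-case.
    """
    pairs = list(zip(measured_egfr, predicted_egfr))
    cases = [pred for meas, pred in pairs if meas < threshold]
    non_cases = [pred for meas, pred in pairs if meas >= threshold]
    if not cases or not non_cases:
        raise ValueError("need at least one patient on each side of the threshold")

    correctly_ranked = 0.0
    for case_pred in cases:
        for non_case_pred in non_cases:
            if case_pred < non_case_pred:
                correctly_ranked += 1.0
            elif case_pred == non_case_pred:
                correctly_ranked += 0.5  # ties count as half
    return correctly_ranked / (len(cases) * len(non_cases))


# Toy values for illustration only (not study data), in mL/min/1.73 m².
measured = [85, 72, 58, 44, 28, 65, 51, 38]
predicted = [80, 62, 66, 47, 35, 70, 55, 49]

for cutoff in (60, 45, 30):
    print(cutoff, round(auc_below_threshold(measured, predicted, cutoff), 3))
```

The pairwise loop is quadratic in cohort size and is used here only for readability; a rank-based computation would be the usual choice for cohorts of the size reported in this study.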

Results

The model training cohort included 117,476 patients (mean age 74.8 years, female 58.2%) and the validation cohort included 56,744 patients (mean age 73.8 years, female 59.6%). In the validation cohort, the AUC of the primary model (with 113 predictors and an adjusted R² of 0.35) for predicting eGFR <60, eGFR <45, and eGFR <30 mL/min/1.73 m² categories was 0.81, 0.88, and 0.92, respectively, and the corresponding positive predictive values for these 3 phenotypes were 0.80 (95% confidence interval: 0.79, 0.81), 0.79 (0.75, 0.84), and 0.38 (0.30, 0.45), respectively.
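For readers unfamiliar with how a positive predictive value (PPV) and its confidence interval are obtained, the following is a minimal sketch under stated assumptions. The abstract does not say how predicted categories were assigned or which interval method was used; this sketch assumes a patient is flagged when predicted eGFR falls below the same threshold, and it uses a normal-approximation (Wald) interval. The function name `ppv_with_wald_ci` and the toy values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ppv_with_wald_ci(
    measured_egfr: Sequence[float],
    predicted_egfr: Sequence[float],
    threshold: float,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (PPV, lower, upper) for the phenotype "eGFR < threshold".

    Assumption: a patient is flagged when predicted eGFR < threshold.
    PPV is the share of flagged patients whose measured eGFR is also
    below the threshold. The interval is a Wald interval clipped to [0, 1].
    """
    flagged = [
        meas for meas, pred in zip(measured_egfr, predicted_egfr) if pred < threshold
    ]
    if not flagged:
        raise ValueError("no patients were flagged at this threshold")

    n_flagged = len(flagged)
    true_positives = sum(1 for meas in flagged if meas < threshold)
    ppv = true_positives / n_flagged
    half_width = z * math.sqrt(ppv * (1.0 - ppv) / n_flagged)
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)


# Toy values for illustration only (not study data), in mL/min/1.73 m².
measured = [85, 72, 58, 44, 28, 65, 51, 38]
predicted = [80, 62, 66, 47, 35, 57, 55, 49]

ppv, lower, upper = ppv_with_wald_ci(measured, predicted, threshold=60)
print(f"PPV = {ppv:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

With very few flagged patients the Wald interval is wide and can reach the boundary, which is why the sketch clips it to the range 0 to 1; other interval methods may behave better in small samples.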

Conclusion

We developed a claims-based model to determine clinical phenotypes of CKD stages defined by eGFR values. Researchers without access to laboratory results can use the model-predicted phenotypes as a proxy clinical endpoint or confounder and to enhance subgroup effect assessment.

Data Sharing Statement

Data supporting the results reported in this manuscript contain detailed, patient-level clinical information and therefore cannot be made available publicly to protect patient privacy. The data accessed in this study comply with all relevant data protection and privacy regulations.

Disclosure

The authors declare no conflicts of interest in this work.

Additional information

Funding

This study was funded by the National Institutes of Health (1RF1AG063381-01 and R01LM013204). The funder had no role in the design, collection, analysis, interpretation of the data, or the decision to submit the manuscript for publication.