Prediction of Disease Severity Using Machine Learning Algorithms: An Analysis for Chronic Kidney Disease in the US
Speaker(s)
Verma V (1), Rastogi M (2), Bharti S (2), Pandey S (2), Sanyal S (2), Bansal V (2), Gaur A (2), Daral S (2), Kukreja I (2), Nayyar A (2), Roy A (2), Khan S (1)
(1) Optum, Gurgaon, HR, India; (2) Optum, Gurugram, HR, India
OBJECTIVES: Chronic kidney disease (CKD) is a growing concern in the US, where approximately 14% of adults have kidney ailments. Because a lack of proper healthcare facilities and inadequate awareness allow the disease to progress, there is a pressing need to identify the level of kidney dysfunction and begin treatment promptly.
In this study, we evaluate machine learning (ML) algorithms for predicting CKD severity to support appropriate and timely intervention.
METHODS: The Optum® de-identified Market Clarity database was used for this study; its electronic health record (EHR) data were the main source for patients’ underlying comorbidities and laboratory test results, which together provided 30 predictors. A total of 130,000 patients aged 45 years and older were included from 2015-2021, with CKD patients categorized as mild (3.7%), moderate (13.8%), and severe (0.7%). Supervised ML techniques, namely logistic regression, random forest, gradient boosting, neural network, and XGBoost, were used to predict the disease stage.
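The severity categories reported in METHODS are strongly imbalanced, which matters for any supervised classifier. The sketch below is illustrative only: the abstract does not say whether class weighting was used, and the "none" share is an assumption (the remainder after the three reported CKD categories). It computes inverse-frequency weights, the same heuristic scikit-learn applies for class_weight="balanced".

```python
# Class shares from the abstract; "none" is an ASSUMED remainder category
# (patients not falling into the mild, moderate, or severe groups).
CLASS_SHARES = {
    "none": 1.0 - (0.037 + 0.138 + 0.007),
    "mild": 0.037,
    "moderate": 0.138,
    "severe": 0.007,
}


def balanced_class_weights(shares):
    """Inverse-frequency weights: weight_c = total / (n_classes * share_c)."""
    total = sum(shares.values())
    n_classes = len(shares)
    return {label: total / (n_classes * share) for label, share in shares.items()}


weights = balanced_class_weights(CLASS_SHARES)
for label, weight in weights.items():
    print(f"{label:>8}: share={CLASS_SHARES[label]:.3f}  weight={weight:.2f}")
```

With weights of this kind, each class contributes equally to the weighted training loss, so the rare severe class is not drowned out by the majority class.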
RESULTS: Each model was evaluated on how accurately it predicted disease severity. XGBoost achieved the best accuracy, 73% (AUC: 0.8; F1 score: 0.4).
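A high accuracy alongside a much lower F1 score is a common pattern when classes are imbalanced. The sketch below shows how both metrics are derived from a confusion matrix. The matrix is a hypothetical toy example, not the study's data, and macro-averaging of F1 is an assumption, since the abstract does not state which averaging was used.

```python
# HYPOTHETICAL confusion matrix per 1,000 patients (rows = true class,
# columns = predicted class), ordered none, mild, moderate, severe.
# Row totals mimic the class shares in the abstract; the cell values are invented.
TOY_CM = [
    [700, 40, 70, 8],
    [20, 8, 8, 1],
    [70, 10, 55, 3],
    [3, 1, 2, 1],
]


def accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_f1(cm):
    """F1 for each class: 2*TP / (2*TP + FP + FN)."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


print(f"accuracy = {accuracy(TOY_CM):.3f}")
print(f"macro F1 = {macro_f1(TOY_CM):.3f}")
```

In this toy case the majority class dominates accuracy, while the rare classes pull the macro-averaged F1 down, which is why the two figures should be read together.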
The laboratory tests with the highest feature importance were serum creatinine (11.46%), triglyceride (5.42%), potassium (4.81%), total protein (4.25%), and aspartate aminotransferase (4.04%). Age was also found to significantly affect CKD progression, with a feature importance of 5.27%.
CONCLUSIONS: The vast repository of clinical data in the EHR database was used effectively in our study to predict CKD progression. The model can be implemented as a decision support tool to minimize prescription errors and to assist providers with timely and accurate prognosis, enabling efficient care delivery for CKD patients.
Code
RWD185
Topic
Real World Data & Information Systems
Topic Subcategory
Data Protection, Integrity, & Quality Assurance, Distributed Data & Research Networks, Health & Insurance Records Systems, Reproducibility & Replicability
Disease
Urinary/Kidney Disorders