Development and Validation of a Machine Learning Algorithm to Identify Patients with Atypical Hemolytic Uremic Syndrome from US Claims Data
Author(s)
Lyons G1, Ong ML1, Bandaru R1, Gangaraju R2, Samuel J3, Wang Y1
1Alexion, AstraZeneca Rare Disease, Boston, MA, USA, 2Division of Hematology and Oncology, Department of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA, 3Renal Associates of Baton Rouge, Baton Rouge, LA, USA
OBJECTIVES: Before 2022, there was no specific International Classification of Diseases (ICD) code for atypical hemolytic uremic syndrome (aHUS), making it difficult to use claims data to study aHUS outcomes. This study aimed to develop and validate an algorithm to identify patients with aHUS using an insurance claims database.
METHODS: We used linked electronic health record and claims data (Optum’s de-identified Market Clarity Data; Jan-2016 to Jan-2022) from patients with chronic kidney disease who met the following inclusion criteria: ≥2 hemolytic uremic syndrome diagnosis codes (ICD-10, D59.3) or ≥2 thrombotic microangiopathy (TMA) diagnosis codes (ICD-10, M31.1), data available from ≥2 physician notes, and no evidence of other complement-mediated diseases. Ground-truth aHUS was established using tokenized physician notes from the database. A previously established rule-based algorithm could not clearly differentiate among TMA types because laboratory results needed for differential diagnosis were lacking. Three machine learning (ML) algorithms (elastic net, classification tree, and random forest) were developed using clinical features extracted from patient medical records, including renal, pulmonary, cardiovascular, hematologic, and other features. Overall, 70% of the data were used to train the models and 30% were reserved for testing.
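The 70%/30% partition described above can be illustrated with a minimal sketch. The abstract does not say how the split was drawn, so stratifying by outcome class, the function name `stratified_split`, the seed, and the synthetic labels are all assumptions made for illustration; none of this is the authors' code or data.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.30, seed=0):
    """Return (train_idx, test_idx), keeping a similar class mix in both parts.

    Hypothetical helper: stratification is an assumption, not a detail
    reported in the abstract.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels only (1 = aHUS, 0 = not aHUS); not study data.
labels = [1] * 30 + [0] * 170
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Stratifying is one common choice when the positive class is a minority, because it keeps the proportion of positive cases similar in the training and test partitions.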
RESULTS: Of the 1,992 patients included, 304 (15%) had ground-truth aHUS. In the test set, the elastic net ML algorithm had the highest positive predictive value (85%) for identifying aHUS, with 73% sensitivity and 98% specificity. The most important features that the ML algorithm used to identify patients with aHUS were hospitalization, renal disease progression, cardiovascular/pulmonary complications, and the absence of bacterial infections.
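For readers less familiar with the three reported metrics, the sketch below shows how positive predictive value, sensitivity, and specificity follow from the counts in a confusion matrix. The function name `screening_metrics` and the counts passed to it are hypothetical and chosen only for easy arithmetic; they are not the study's confusion matrix, which the abstract does not report.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute PPV, sensitivity, and specificity from confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    return {
        # Of the patients flagged as aHUS, the fraction who truly have it.
        "ppv": tp / (tp + fp),
        # Of the patients who truly have aHUS, the fraction flagged.
        "sensitivity": tp / (tp + fn),
        # Of the patients without aHUS, the fraction correctly not flagged.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only (not from the study).
metrics = screening_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because PPV depends on how common the condition is in the evaluated population, it is usually read alongside sensitivity and specificity rather than on its own.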
CONCLUSIONS: This ML algorithm approach may address the limitations of rule-based algorithms and has the potential to permit accurate identification of more patients with aHUS in claims data and provide an improved understanding of the aHUS treatment landscape.
Conference/Value in Health Info
Value in Health, Volume 26, Issue 6, S2 (June 2023)
Code
MSR98
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
Urinary/Kidney Disorders