Replicability and Deployment of a Systemic Lupus Erythematosus Flare Prediction Model From Administrative Claims to Electronic Medical Records

Author(s)

Min J¹, Stein E², Igho-Osagie E¹, Evans L², Warnick J², Doole J², Chan KA², Liu J¹, Wang E³
¹Merck & Co., Inc., Rahway, NJ, USA, ²TriNetX, LLC, Cambridge, MA, USA, ³Merck & Co., Inc., Boston, MA, USA

Presentation Documents

wanhaoyi_W5986245_2024_ISPOR_Euro-ReplicabilityDeploy_Poster_v1.03 final142364.pdf

OBJECTIVES: Systemic lupus erythematosus (SLE) flares are associated with higher morbidity and mortality, yet flares are not codified in administrative healthcare databases. Algorithms to identify flaring patients have been published, but understanding of their external validity and transportability is limited; this study aims to validate, improve and deploy a published algorithm.

METHODS: This study used the TriNetX Dataworks-USA de-identified electronic medical records (EMR) database from 60 healthcare organizations across the US. Adult patients newly diagnosed with SLE between January 1, 2018 and October 11, 2023 who had at least one encounter before diagnosis were included. A validation dataset was created by identifying 151 patients with flares and patients without flares using medical chart reviews. The remaining de-identified EMR data not selected for chart review were randomly divided into training (80%) and testing (20%) datasets.
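As an illustration of the random 80%/20% division described above, a minimal sketch follows. The patient identifiers, the seed, and the helper name `split_train_test` are hypothetical; the abstract does not state how the split was implemented or whether it was stratified.

```python
import random

def split_train_test(patient_ids, train_fraction=0.8, seed=0):
    """Randomly partition patient IDs into training and testing lists."""
    shuffled = list(patient_ids)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical IDs standing in for patients not selected for chart review.
remaining_ids = [f"P{i:05d}" for i in range(1000)]
train_ids, test_ids = split_train_test(remaining_ids)
```

Splitting at the patient level, as sketched here, keeps all records for one patient on the same side of the split.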

The published algorithm (Goetz, 2022), which uses 10 predictors with proxy SLEDAI-2k scores as the outcome, was replicated. To improve model performance, regression models of all possible combinations of the 10 predictors were trained and tested. Models that best predicted the SLEDAI-2k score and contained clinically relevant predictors were selected for validation.
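To give a sense of the size of the search over "all possible combinations", the sketch below enumerates every non-empty subset of 10 predictors. The predictor names are placeholders, since the abstract does not list them, and excluding the empty set is an assumption.

```python
from itertools import combinations

def all_predictor_subsets(predictors):
    """Yield every non-empty combination of the given predictors."""
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            yield subset

# Placeholder names; the abstract does not list the 10 predictors.
predictors = [f"predictor_{i}" for i in range(1, 11)]
subsets = list(all_predictor_subsets(predictors))
print(len(subsets))  # 2**10 - 1 = 1023 candidate models
```

Under these assumptions, each subset would correspond to one regression model fitted on the training data and scored on the testing data.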

RESULTS: Overall, 31,666 patients met the inclusion criteria for this study. The replicated algorithm performed poorly (Brier score: 0.91, C-statistic: 0.54). Seven models with improved performance were selected for validation; they yielded positive predictive values (PPV) of 0.63-0.70, sensitivities of 0.44-0.72, specificities of 0.47-0.76, and C-statistics of 0.59-0.60. The best-performing model, selected for its highest PPV (0.70), had a sensitivity of 0.55, a specificity of 0.65, and a C-statistic of 0.60.
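For readers less familiar with the metrics reported above, the following sketch shows their standard definitions for a binary flare label. The toy labels, predicted probabilities, and the 0.5 threshold are invented for illustration and are not study data; the functions assume both classes are present and at least one positive prediction is made.

```python
def classification_metrics(y_true, y_pred):
    """PPV, sensitivity, and specificity from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "ppv": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def c_statistic(y_true, y_prob):
    """Share of (flare, non-flare) pairs ranked correctly; ties count as 0.5."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    score = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return score / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = chart-confirmed flare, 0 = no flare.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
brier = brier_score(y_true, y_prob)
c_stat = c_statistic(y_true, y_prob)
```

A C-statistic of 0.5 corresponds to chance-level ranking, and a lower Brier score indicates better-calibrated probabilities.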

CONCLUSIONS: Our models had moderate predictive performance for identifying SLE flares in EMR data. This may be because the published claims-based algorithm was not fully replicable and transportable in our de-identified EMR data. While challenges remain, establishing the external validity and replicability of predictive models is critical for real-world applications.

Conference/Value in Health Info

2024-11, ISPOR Europe 2024, Barcelona, Spain

Value in Health, Volume 27, Issue 12, S2 (December 2024)

Code

MSR151

Topic

Clinical Outcomes, Methodological & Statistical Research, Real World Data & Information Systems, Study Approaches

Topic Subcategory

Clinical Outcomes Assessment, Electronic Medical & Health Records, Reproducibility & Replicability

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Systemic Disorders/Conditions (Anesthesia, Auto-Immune Disorders (n.e.c.), Hematological Disorders (non-oncologic), Pain)
