Using the Health Improvement Distribution Index to Inform Equitable Machine Learning: An Example Analysis Among Patients with Type 2 Diabetes Mellitus
Author(s)
Carpenito T, Graf L, Gulla J, Munsell M
Panalgo, Boston, MA, USA
OBJECTIVES: This study aimed to train a machine learning model that predicts type 2 diabetes mellitus (T2DM)-related hospitalizations and to evaluate its performance among protected classes (i.e., Hispanic, Asian, and Black patients) for which equal access to care could lower disparities, as indicated by a value >1 on ICER’s Health Improvement Distribution Index (HIDI) for T2DM.
METHODS: Patients ≤80 years of age with ≥1 inpatient or ≥2 outpatient diagnoses of T2DM (the first diagnosis was deemed the index date) were identified from 6/30/2021 to 6/30/2023 in a U.S. dataset that includes administrative and claims data for over 170 million patients across commercial payors, Medicare Advantage, and Medicaid. All diagnoses and procedures from a 12-month baseline period were included as features in an XGBoost model predicting T2DM-related hospitalization six months post-index. Race was not used in model training per standard ethical practice. Using a 25% test sample and a prediction threshold of >5.48% (the training-sample prevalence of hospitalization), false negatives produced by the model were identified and stratified by race.
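The thresholding and stratification step described in the Methods can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `false_negative_rates`, the group labels, and all toy values are hypothetical, and it assumes predicted probabilities are already available from a fitted model. A patient counts as a false negative when a true hospitalization receives a score that does not exceed the threshold, which is set to the training-sample prevalence.

```python
from collections import defaultdict


def false_negative_rates(y_true, y_score, groups, threshold):
    """Share of true positives NOT flagged (score <= threshold), overall and per group."""
    missed = defaultdict(int)
    positives = defaultdict(int)
    for label, score, group in zip(y_true, y_score, groups):
        if label != 1:
            continue  # only true hospitalizations can be false negatives
        for key in ("overall", group):
            positives[key] += 1
            if not score > threshold:  # flagged only when score exceeds the threshold
                missed[key] += 1
    return {key: missed[key] / positives[key] for key in positives}


# Hypothetical toy data (not from the study).
train_labels = [1, 0, 0, 0]
prevalence = sum(train_labels) / len(train_labels)  # threshold = training prevalence

y_true = [1, 1, 1, 0, 1, 0]                      # observed hospitalization in test sample
y_score = [0.40, 0.10, 0.20, 0.50, 0.20, 0.05]   # model-predicted probabilities
groups = ["A", "A", "B", "B", "B", "A"]          # stratification variable, unused in training

rates = false_negative_rates(y_true, y_score, groups, prevalence)
for key, rate in rates.items():
    print(f"{key}: {rate:.2%} of true hospitalizations missed")
```

Because the stratification variable is used only after scoring, this kind of audit is compatible with excluding race from model training.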
RESULTS: 2,353,083 patients met eligibility criteria (30.81% Caucasian, 14.51% Hispanic, 12.87% Black, 6.01% Asian, 35.80% Other/Unknown). The best performing model identified T2DM-related hospitalizations with an AUC of 74.65%. True hospitalizations were missed at a lower-than-average rate for Black patients (25.70% vs. 30.05% overall) and at a higher rate for Asian and Hispanic patients (49.75% and 36.25%, respectively). Among Asian and Hispanic patients, top predictors indicative of preventative care, such as receiving an electrocardiogram during baseline, were less likely to be observed.
CONCLUSIONS: While Asian and Hispanic patients have a relatively high prevalence of T2DM, they had low representation in administrative claims and fewer of the baseline procedures that contribute to an accurate hospitalization risk score. Benchmarks such as the HIDI can be used by researchers to better understand the potential for risk algorithms to exacerbate existing health inequalities for protected classes.
Conference/Value in Health Info
Value in Health, Volume 27, Issue 6, S1 (June 2024)
Acceptance Code
P10
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
Diabetes/Endocrine/Metabolic Disorders (including obesity), No Additional Disease & Conditions/Specialized Treatment Areas