Racial and Ethnic Fairness in Risk Prediction Models for Cardiovascular Diseases Among U.S. Adults

Author(s)

ABSTRACT WITHDRAWN

OBJECTIVES: Racial/ethnic fairness has become a concern when clinical machine learning models are used for risk prediction. This study examines the tradeoff between model performance and racial/ethnic fairness across different modeling scenarios and classification algorithms for predicting cardiovascular disease (CVD).

METHODS: Using individual-level data from the National Health and Nutrition Examination Survey (NHANES) from 2007 to 2018, we compared general models and race-specific models in six scenarios: (1) race-neutral (RN), excluding race/ethnicity as a predictor; (2) race-sensitive (RS), including race/ethnicity as a predictor; (3) RN with cross-validation stratified by race/ethnicity; (4) RN with stratification by CVD status; (5) RS with stratification by CVD status; (6) RN with stratification by both CVD status and race/ethnicity. Each scenario was evaluated with eight algorithms. Racial/ethnic fairness was measured as the range of the C-statistic (area under the curve, AUC) across racial/ethnic groups.
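
The fairness measure and the stratified scenarios described above can be illustrated with a minimal sketch. This is not the authors' code: the function names (`auc`, `auc_range_by_group`, `stratified_folds`), the round-robin fold assignment, and the toy data are all assumptions made for illustration, and the abstract does not say how ties or fold assignment were handled.

```python
from collections import defaultdict


def auc(y_true, y_score):
    """C-statistic: chance that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_range_by_group(y_true, y_score, groups):
    """Return per-group AUCs and their range (max minus min) as a fairness gap."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(y_true, y_score, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    per_group = {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}
    return per_group, max(per_group.values()) - min(per_group.values())


def stratified_folds(strata, k):
    """Assign records to k folds round-robin within each stratum (hypothetical helper)."""
    seen = defaultdict(int)
    folds = []
    for s in strata:
        folds.append(seen[s] % k)
        seen[s] += 1
    return folds


# Toy data (invented): CVD outcome, predicted risk, and a two-level group label.
cvd = [1, 0, 1, 0, 1, 0, 1, 0]
risk = [0.9, 0.2, 0.8, 0.1, 0.6, 0.7, 0.8, 0.3]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group_auc, fairness_gap = auc_range_by_group(cvd, risk, group)

# Stratifying on (outcome, group) pairs, in the spirit of scenario 6.
folds = stratified_folds(list(zip(cvd, group)), k=2)
```

A smaller `fairness_gap` means the model discriminates more evenly across groups; it says nothing by itself about whether the overall AUC is acceptable, which is why the two are reported together.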

RESULTS: We included 4942 patients (41.4% non-Hispanic White, 27% Hispanic, 22.3% non-Hispanic Black/African American, and 9.3% non-Hispanic Other Races). Among the 48 scenario-algorithm combinations, the gradient descent algorithm in scenarios 1 and 5 showed the best fairness, but in both cases the race-specific models performed worst within their scenarios. The general model nevertheless had acceptable discrimination in these cases (AUC > 0.7). Across all race-specific models, performance was worst for the Hispanic group. Comparing race-neutral and race-sensitive scenarios, neither adding race/ethnicity as a predictor nor stratifying by race/ethnicity necessarily improved performance or fairness. Logistic regression, naive Bayes, and elastic net algorithms showed acceptable discriminative ability and fairness in all scenarios.

CONCLUSIONS: In CVD risk prediction, the way race/ethnicity is handled during model building can change which combination of scenario and algorithm is optimal. Clinical prediction model development should consider fairness alongside other factors, such as imbalanced data, when training models under different scenarios, in order to eliminate potential model-generated health inequalities.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Acceptance Code

P12

Topic

Health Policy & Regulatory, Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Health Disparities & Equity

Disease

Cardiovascular Disorders (including MI, Stroke, Circulatory), No Additional Disease & Conditions/Specialized Treatment Areas

Presentation (Paper)