Using Biased Proxy As Outcomes for Prediction Models: Are We (Re)Producing Health Inequalities?

Author(s)

Khor S1, Basu A1, Hahn EE2, Haupt EC2, Lyons LJ2, Henry CP1, Shankaran V3, Heagerty PJ1, Bansal A1
1University of Washington, Seattle, WA, USA, 2Southern California Permanente Medical Group, Pasadena, CA, USA, 3Fred Hutch, Seattle, WA, USA

OBJECTIVES:

It is common practice to capture health outcomes in electronic health record data using proxies based on healthcare utilization because detailed chart review is often not feasible. However, because healthcare utilization often varies systematically across racial subgroups, these proxy outcomes can be biased. We aimed to examine racial bias in a utilization-based proxy for colorectal cancer (CRC) recurrence when the goal is to predict the risk of recurrence to inform care decisions.

METHODS:

The study cohort consisted of adults with CRC who underwent resection in a large integrated healthcare system. CRC recurrence was identified using validated healthcare utilization algorithms. We compared this utilization-based recurrence status against the chart-reviewed gold-standard recurrence status and assessed the positive and negative predictive values (PPV and NPV) by racial/ethnic subgroup.
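The subgroup comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `ppv_npv_by_group`, the record layout, and the toy records are all hypothetical, and the toy values are made up and do not come from the study.

```python
from collections import defaultdict


def ppv_npv_by_group(records):
    """Compute PPV and NPV of a proxy outcome per subgroup.

    records: iterable of (group, proxy_positive, gold_positive) tuples,
    where the two flags are booleans (hypothetical layout).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, proxy, gold in records:
        if proxy and gold:
            key = "tp"  # proxy flags recurrence, chart review confirms
        elif proxy and not gold:
            key = "fp"  # proxy flags recurrence, chart review does not
        elif not proxy and gold:
            key = "fn"  # proxy misses a chart-confirmed recurrence
        else:
            key = "tn"  # both agree there is no recurrence
        counts[group][key] += 1

    result = {}
    for group, c in counts.items():
        proxy_pos = c["tp"] + c["fp"]
        proxy_neg = c["tn"] + c["fn"]
        result[group] = {
            # PPV: share of proxy-positive patients with true recurrence
            "ppv": c["tp"] / proxy_pos if proxy_pos else None,
            # NPV: share of proxy-negative patients truly recurrence-free
            "npv": c["tn"] / proxy_neg if proxy_neg else None,
        }
    return result


# Made-up toy records for illustration only (not study data).
toy = [
    ("A", True, True), ("A", True, False),
    ("A", False, False), ("A", False, False),
    ("B", True, True), ("B", False, False), ("B", False, True),
]
metrics = ppv_npv_by_group(toy)
```

Returning `None` when a subgroup has no proxy-positive or proxy-negative patients avoids a division by zero, which matters for small subgroups.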

RESULTS:

Among 175 patients (mean age 64; 48% female), 51% were non-Hispanic White (NHW), 25% Hispanic, 11% Black/African American (AA), and 11% Asian/Pacific Islander. The utilization-based proxy for recurrence had good overall performance (PPV = 85%, NPV = 95%). However, we observed higher false positive rates (FPR) among racial/ethnic minority subgroups, which translated to lower PPVs of 73% in Hispanic patients and 82% in AA patients vs. 90% in NHW patients. The higher FPR also inflated the 5-year cumulative incidence of recurrence estimated using the utilization-based outcome relative to the gold standard in the Hispanic and AA subgroups (Hispanic: 57% vs. 41%; AA: 55% vs. 43%), but not in the NHW subgroup (37% vs. 37%). NPVs were similar across groups (NHW 95%; Hispanic 95%; AA 100%). Detailed chart review revealed that the higher FPR among racial/ethnic minority patients may be partially due to delayed primary treatment being misclassified as recurrence.
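The link between a higher FPR and a lower PPV follows from Bayes' rule. As a simplified sketch, write $\mathrm{Se}$ for the proxy's sensitivity and $\pi$ for the true recurrence proportion in a subgroup; both symbols are introduced here for illustration and are not reported in the abstract.

```latex
\mathrm{PPV}
  = \frac{\mathrm{Se}\,\pi}{\mathrm{Se}\,\pi + \mathrm{FPR}\,(1-\pi)},
\qquad
\Pr(\text{proxy positive}) = \mathrm{Se}\,\pi + \mathrm{FPR}\,(1-\pi).
```

For fixed $\mathrm{Se}$ and $\pi$, a larger FPR increases the denominator and so lowers the PPV. It also raises the proportion flagged by the proxy, which exceeds $\pi$ whenever $\mathrm{FPR}\,(1-\pi) > (1-\mathrm{Se})\,\pi$. This ignores the time-to-event structure of the 5-year cumulative incidence estimates, so it should be read as intuition only.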

CONCLUSIONS:

The utilization-based proxy inflated the recurrence rates for the Hispanic and AA subgroups. Using a biased proxy outcome to develop risk prediction models that inform care decisions may incorrectly estimate the health needs of Hispanic/Black patients, resulting in inappropriate healthcare recommendations that further perpetuate healthcare disparities.

Conference/Value in Health Info

2022-05, ISPOR 2022, Washington, DC, USA

Value in Health, Volume 25, Issue 6, S1 (June 2022)

Code

MSR27

Topic

Health Policy & Regulatory, Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Electronic Medical & Health Records, Health Disparities & Equity

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
