Race and Ethnicity in Oracle EHR Real-World Data: A Contrast to Self-Reported Data in the United States
Speaker(s)
Rascon Velasco V1, Berliner E2, Jaffe DH3, Beckwith SM4, Furegato M5, Brignoli L1
1Oracle Life Sciences, Paris, 75, France, 2Oracle Life Sciences, Kansas City, MO, USA, 3Oracle Life Sciences, Jerusalem, Israel, 4Oracle Life Sciences, Austin, TX, USA, 5Oracle Life Sciences, Paris, France
OBJECTIVES: To contrast race and ethnicity data reported in Oracle EHR Real-World Data (OERWD) with a US self-reported survey.
METHODS: Data from OERWD, a de-identified, HIPAA-compliant dataset from 139 US health systems (<2023, n=105,832,841), were linked to data from the US National Health and Wellness Survey (NHWS) (2015-2022, n=292,391), a nationally representative, self-reported, cross-sectional online survey. Individuals with invalid or inconsistent data for birth or encounter dates, race, or ethnicity within each dataset were removed. Race was categorized as American Indian/Alaskan Native, Asian, Black/African American, Native Hawaiian/Other Pacific Islander, White, Other, and Unknown. A Mixed category was added if multiple races were selected in NHWS. Ethnicity was reported as Non-Hispanic, Hispanic/Latino, and Unknown. Descriptive statistics were derived.
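The comparison described above can be illustrated with a minimal sketch. It assumes each linked individual has one EHR value and one survey value, and that a pair is labeled concordant, discordant, or Unknown in the EHR. The function names, the label strings, and the toy records are hypothetical and are not taken from the study.

```python
from collections import Counter

UNKNOWN = "Unknown"

def classify_pair(ehr_value, survey_value):
    """Label one linked individual by comparing the EHR and survey values."""
    # Assumption: a missing or "Unknown" EHR value is counted separately
    # rather than as a mismatch.
    if ehr_value is None or ehr_value == UNKNOWN:
        return "unknown_in_ehr"
    return "concordant" if ehr_value == survey_value else "discordant"

def concordance_summary(pairs):
    """Return {label: (count, percent)} over all (ehr, survey) pairs."""
    counts = Counter(classify_pair(e, s) for e, s in pairs)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}

# Toy, made-up records (EHR race, survey race); not study data.
toy_pairs = [
    ("White", "White"),
    ("White", "Mixed"),
    ("Unknown", "Asian"),
    ("Black/African American", "Black/African American"),
]
summary = concordance_summary(toy_pairs)
print(summary)
```

The same function would apply to ethnicity by passing ethnicity values instead of race values.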
RESULTS: We identified 29,992 individuals; 64.7% were women, and the overall mean age at last encounter was 46.2 years (SD=19.6). For race, 72.3% (n=21,683) of patients had concordant responses in both datasets, 9.5% (n=2,852) were discordant, and 18.2% (n=5,457) were Unknown in OERWD. The highest proportions of matched responses were observed among patients identifying as White (78.6%), Black/African American (74.0%), and Asian (51.1%). For all races except Asian and White, most individuals with discordant answers were recorded as White in OERWD. Regarding ethnicity, 58.0% (n=17,396) of patients had concordant answers, 8.2% (n=2,470) were discordant, and 33.8% (n=10,126) were Unknown in OERWD. The proportion of matching answers increased with age at last encounter, from 46.4% (ages 0-14) to 86.0% (ages 75+) for race and from 10.3% to 81.5% for ethnicity.
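As a quick arithmetic check, the reported percentages can be recomputed from the reported counts. The sketch below uses only the counts stated in the results; the helper name `shares` and the dictionary keys are illustrative.

```python
def shares(counts):
    """Return the total and each category's share as a percentage (1 decimal)."""
    total = sum(counts.values())
    return total, {k: round(100 * v / total, 1) for k, v in counts.items()}

# Counts as reported in the abstract.
race = {"concordant": 21_683, "discordant": 2_852, "unknown_in_oerwd": 5_457}
ethnicity = {"concordant": 17_396, "discordant": 2_470, "unknown_in_oerwd": 10_126}

race_total, race_pct = shares(race)
eth_total, eth_pct = shares(ethnicity)

print(race_total, race_pct)
print(eth_total, eth_pct)
```

Both sets of counts sum to the 29,992 identified individuals, and the recomputed shares agree with the reported percentages after rounding to one decimal place.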
CONCLUSIONS: Our study showed adequate concordance between EHR-recorded and self-reported race and ethnicity, indicating that these variables are reliable when recorded in the EHR. Further research is needed to better understand the reasons for underreporting.
Code
SA77
Topic
Epidemiology & Public Health, Study Approaches
Topic Subcategory
Electronic Medical & Health Records, Public Health
Disease
No Additional Disease & Conditions/Specialized Treatment Areas