Identifying Patient-Level Social Determinants of Health in Unstructured Clinical Notes From Electronic Health Records
Author(s)
Nikbakht M, Kumar V, Rasouliyan L
OMNY Health, Atlanta, GA, USA
OBJECTIVES: This study aimed to identify and quantify patient-level occurrences of concepts related to social determinants of health (SDoH) captured in unstructured clinical notes from electronic health records (EHRs).
METHODS: Comprehensive EHR-based clinical notes collected from an integrated delivery network on the OMNY Health Platform were included. Notes were searched for key terms related to the following SDoH constructs: income and social protection (ISP), education (EDU), unemployment and job insecurity (UJI), food insecurity (FI), housing and basic amenities (HBA), early childhood development (ECD), and access to affordable health services (AHS). For each SDoH construct, the percentages of patients and of notes containing the construct's key terms were computed.
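The key-term search and percentage computation described above could be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not list the key terms used, so the terms in `KEY_TERMS`, the function names, and the `(patient_id, text)` input shape are all hypothetical assumptions.

```python
import re
from collections import defaultdict

# Hypothetical key terms for three of the seven constructs; the actual
# term lists used in the study are not given in the abstract.
KEY_TERMS = {
    "UJI": ["unemployed", "laid off", "lost job"],
    "FI": ["food insecurity", "food stamps", "food pantry"],
    "HBA": ["homeless", "eviction"],
}


def compile_patterns(key_terms):
    """Build one case-insensitive, whole-word regex per construct."""
    return {
        construct: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for construct, terms in key_terms.items()
    }


def sdoh_percentages(notes, key_terms=KEY_TERMS):
    """Return {construct: (pct_patients, pct_notes)}.

    notes: iterable of (patient_id, text) pairs (an assumed input shape).
    A patient counts toward a construct if any of their notes matches.
    """
    patterns = compile_patterns(key_terms)
    all_patients = set()
    n_notes = 0
    patients_hit = defaultdict(set)
    notes_hit = defaultdict(int)

    for patient_id, text in notes:
        n_notes += 1
        all_patients.add(patient_id)
        for construct, pattern in patterns.items():
            if pattern.search(text):
                patients_hit[construct].add(patient_id)
                notes_hit[construct] += 1

    if n_notes == 0:
        return {}
    return {
        construct: (
            100.0 * len(patients_hit[construct]) / len(all_patients),
            100.0 * notes_hit[construct] / n_notes,
        )
        for construct in patterns
    }


example_notes = [
    ("p1", "Patient was laid off last month."),
    ("p1", "Follow-up visit, no complaints."),
    ("p2", "Uses a food pantry weekly."),
    ("p2", "Reports being unemployed and homeless."),
]
result = sdoh_percentages(example_notes)
```

Because one patient usually has many notes, the patient-level percentage for a construct can be much higher than the note-level percentage, which is the pattern seen in the reported results. Simple term matching of this kind does not handle negation or context (for example, "denies food insecurity").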
RESULTS: Of approximately 20.8 million notes collected, 8.9 million (43%) from 541,050 unique patients were included after keeping only notes that were non-empty, clinical in nature, and at least 10 words in length. The patient population was 56% female, 73/22/5% White/Black/Other, 96% non-Hispanic, 44/25/24/6% <40/40-59/60-79/≥80 years old, and 32/24/21/23% employed/unemployed/retired/other. SDoH constructs with key terms mentioned in the clinical notes in descending order of patient frequency were as follows (% patients; % notes): UJI (22.0%; 2.3%), FI (13.5%; 5.2%), EDU (13.4%; 1.7%), ECD (13.1%; 2.0%), ISP (9.0%; 1.5%), AHS (6.0%; 0.8%), and HBA (0.5%; 0.1%).
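The note inclusion criteria stated above (non-empty, clinical in nature, at least 10 words) might be applied with a filter along these lines. This is a sketch under stated assumptions: the abstract does not say how "clinical in nature" was determined, so the `note_type` field and the `CLINICAL_NOTE_TYPES` set are hypothetical, as is whitespace-based word counting.

```python
# Hypothetical set of note types treated as clinical; the abstract does not
# describe how clinical notes were distinguished from other notes.
CLINICAL_NOTE_TYPES = {"progress note", "discharge summary", "consult note"}

# Minimum length stated in the abstract.
MIN_WORDS = 10


def keep_note(text, note_type):
    """Return True if a note passes all three inclusion criteria."""
    # Criterion 1: non-empty (None or whitespace-only text is excluded).
    if text is None or not text.strip():
        return False
    # Criterion 2: clinical in nature (assumed to be decided by note type).
    if (note_type or "").lower() not in CLINICAL_NOTE_TYPES:
        return False
    # Criterion 3: at least MIN_WORDS whitespace-separated words.
    return len(text.split()) >= MIN_WORDS


ten_words = "Patient seen today for follow up of chronic lower back pain"
```

Here `keep_note(ten_words, "Progress Note")` is kept, while the same text tagged with a non-clinical type, or any note shorter than 10 words, is dropped.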
CONCLUSIONS: SDoH concepts, which are important in many health economics and outcomes research (HEOR) questions, are not typically captured in structured EHR data and/or are available only at the aggregate level. Natural language processing (NLP) of unstructured clinical notes from EHR data may provide a valuable opportunity to extract patient-level SDoH concepts for use in HEOR studies. Future work can use deep learning approaches, such as transformer-based NLP models, to classify and extract SDoH concepts from EHR data.
Conference/Value in Health Info
Value in Health, Volume 25, Issue 12S (December 2022)
Code
RWD64
Topic
Methodological & Statistical Research, Real World Data & Information Systems, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Electronic Medical & Health Records, Health & Insurance Records Systems
Disease
No Additional Disease & Conditions/Specialized Treatment Areas