Leveraging NLP to Characterize Real-World Triptan Use and Adherence From Unstructured EHR
Author(s)
Sunny Guin, PhD1, Katie Brown, PhD2, Sarah Platt, MS2, Agnes Pastwa, BS2, Emily Webber, PhD2.
1Director of Research Analytics, Truveta, Seattle, WA, USA, 2Truveta, Seattle, WA, USA.
OBJECTIVES: Migraines represent a significant burden across health systems, with high direct and indirect costs due to underdiagnosis and inexact treatments. Triptans, while established as a standard acute therapy, exhibit variable real-world use and patient adherence. Traditional structured electronic health record (EHR) data often miss the contextually rich information needed to understand treatment behavior at scale. A natural language processing (NLP) model was developed to characterize triptan use and adherence from unstructured clinical notes.
METHODS: Truveta data provides complete, timely, representative, de-identified EHR data comprising over 120 million patients and 7 billion clinical notes from U.S. health systems. Clinical notes associated with migraine or headache diagnoses and referencing triptan medications were selected. The Truveta Language Model (TLM) extracted medication mentions, associated treatment attributes, and adherence-related concepts. Extracted concepts were mapped to standard clinical ontologies using a zero-shot ontology normalization framework. Model performance was assessed using precision, recall, and accuracy, benchmarked against expert annotations.
RESULTS: From an initial population of ~1.2 million patients with a migraine diagnosis, ~310,000 met inclusion criteria, representing over 1 million qualifying clinical notes. TLM achieved a precision of 86.8%, recall of 86.1%, and accuracy of 76.1%. Iterative training improved performance on lower-frequency concepts (e.g., medication overuse headaches). The identified population was predominantly female (82.4%) and aged 30-49, aligning with known epidemiology. Notable disparities were observed in older and racially diverse populations, underscoring potential gaps in documentation and treatment access. Among patients prescribed a triptan, 26% discontinued treatment; clinicians cited lack of efficacy (55%) and side effects (28%) as reasons.
CONCLUSIONS: This study highlights the value of applying advanced NLP to unstructured clinical data for real-world adherence insights. By enabling large-scale, high-fidelity capture of treatment behaviors, TLM offers new opportunities to inform health policy, optimize migraine care, and improve patient outcomes. This scalable ontology-aligned methodology supports applications in pharmacovigilance, patient segmentation, and population health management.
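ILLUSTRATIVE SKETCH: The abstract reports precision, recall, and accuracy benchmarked against expert annotations but does not state how each metric was computed. The Python sketch below shows the standard definitions from counts of true positives, false positives, false negatives, and true negatives. The `Counts` class, the function names, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts from comparing model output to expert annotations.

    All values used with this class here are hypothetical illustrations.
    """
    tp: int      # concept extracted by the model and annotated by the expert
    fp: int      # concept extracted by the model but not annotated
    fn: int      # concept annotated by the expert but missed by the model
    tn: int = 0  # often zero or undefined for span-extraction tasks


def precision(c: Counts) -> float:
    """Share of model extractions that the expert also annotated."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Counts) -> float:
    """Share of expert annotations that the model recovered."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def accuracy(c: Counts) -> float:
    """Share of all decisions (including true negatives) that were correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    example = Counts(tp=80, fp=20, fn=20)
    print(f"precision={precision(example):.3f}")
    print(f"recall={recall(example):.3f}")
    print(f"accuracy={accuracy(example):.3f}")
```

Under these definitions, when no true negatives are counted, accuracy reduces to TP / (TP + FP + FN) and cannot exceed either precision or recall. Whether the study used this convention is an assumption, not something the abstract confirms.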
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR137
Topic
Clinical Outcomes, Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, Systemic Disorders/Conditions (Anesthesia, Auto-Immune Disorders (n.e.c.), Hematological Disorders (non-oncologic), Pain)