Ways To Minimize Data Quality Headaches in Migraine Research

Speaker(s)

Chandran U¹, Wolfe T², Chen C¹, Riskin D¹
¹Verantos, Menlo Park, CA, USA; ²Verantos, Decatur, GA, USA

OBJECTIVES: Accurate identification of clinical features and symptoms pertinent to migraine is key to discerning patient subgroups for effective and targeted interventions. However, migraine phenotypes may not be reliably characterized in structured electronic health records (EHR) or administrative claims, either because specific codes are missing or because capture within billing pathways is limited. Using advanced technology and unstructured EHR data, this study assessed the accuracy and feasibility of extracting migraine-related features and symptoms that may be largely absent from traditional structured real-world data (RWD).

METHODS: Data from 2,750 EHR encounters were obtained from a US-based integrated delivery health system. A total of 18 pre-specified migraine-related concepts (including subtypes and symptoms) were annotated. A manual reference standard was created, with two annotators reviewing each chart. Inter-rater reliability was measured with Cohen’s kappa. The advanced approach involved training deep learning models to extract concepts from unstructured data. Recall, precision, and F1-score were calculated for all features with at least 20 patient-experienced occurrences in the reference standard.
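The evaluation measures named above can be sketched as follows. This is a minimal illustration and not the study's actual pipeline: the function names, the input layout (true-positive, false-positive, and false-negative counts per concept), and the example counts are all hypothetical. The sketch also assumes that a concept's reference-standard occurrences equal its true positives plus false negatives.

```python
from collections import Counter

# Minimum reference-standard occurrences for a concept to be evaluated
# (the threshold of 20 is taken from the abstract).
MIN_OCCURRENCES = 20


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def evaluate(counts):
    """Score each concept in {concept: (tp, fp, fn)} that meets the threshold.

    Assumption: reference-standard occurrences = tp + fn.
    """
    results = {}
    for concept, (tp, fp, fn) in counts.items():
        if tp + fn < MIN_OCCURRENCES:
            continue  # too few occurrences to assess
        results[concept] = precision_recall_f1(tp, fp, fn)
    return results


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the study.
    demo = {"photophobia": (45, 5, 5), "allodynia": (3, 1, 2)}
    for concept, (p, r, f1) in evaluate(demo).items():
        print(f"{concept}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

With the made-up counts above, only "photophobia" is scored, because "allodynia" has fewer than 20 reference occurrences and is skipped.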

RESULTS: Accuracy measures were calculated for dizziness, photophobia, fatigue, headache, migraine, nausea, phonophobia, vertigo, migraine with aura, aura, migrainous vertigo, transformed migraine, and retinal migraine. The average recall, precision, and F1-score across all evaluable concepts were 91%, 93%, and 92%, respectively, for the advanced approach compared with the reference standard. Inter-rater reliability was >0.80, reflecting a credible reference standard. Concepts with patient-experienced occurrences in <20 encounters (allodynia, hemiplegic migraine, menstrual migraine, episodic migraine, and basilar migraine) were not assessed.

CONCLUSIONS: This study demonstrated the feasibility of measuring data accuracy in RWD and the ability to identify migraine subtypes and symptoms with high accuracy using unstructured data and advanced technology. Being able to capture clinically important migraine features and phenotypes in RWD paves the way for generating scientifically robust real-world evidence in migraine research.

Code

RWD30

Topic

Epidemiology & Public Health, Study Approaches

Topic Subcategory

Disease Classification & Coding, Electronic Medical & Health Records

Disease

Neurological Disorders, No Additional Disease & Conditions/Specialized Treatment Areas