Approaches to Electronic Health Records Notes Selection: Considerations for Best Practices

Speaker(s)

Bui B¹, Tkatch R²
¹Optum, Irvine, CA, USA; ²Optum, Oak Park, MI, USA

OBJECTIVES: Electronic health records (EHRs) are a unique data source that allows researchers to analyze real-world patient data and gain insight into the patient journey. Natural language processing (NLP) is a method of EHR note analysis that transforms unstructured narrative data into structured data, enabling researchers to systematically analyze provider notes on a large scale. EHR notes are unstructured, heterogeneous, and idiosyncratic, with some notes rich in content and others sparse. One challenge that precedes the analysis of EHR notes is selecting the notes that provide the necessary data. NLP methods are largely confined to clinical concept extraction, and a gap exists between concept extraction and concept understanding. This study examines NLP use and provides recommendations regarding note selection.

METHODS: A literature review on current NLP methodology will be conducted to characterize current uses, with the goal of providing recommendations for future advances. Articles will be screened for appropriateness, and themes studied and categorized. Current techniques will be discussed as well as existing challenges.

RESULTS: Current gaps in practice are (1) limited recognition of relationships between clinical concepts (such as treatment-outcome relationships) and (2) difficulty extracting the temporal information needed to understand the timing of clinical events and/or disease progression.

CONCLUSIONS: EHR note search strategies need improvement. Manually reviewing EHRs as a note selection strategy is time-consuming and generally not feasible; NLP screening allows large-scale note selection using keywords. However, many of the notes retrieved may lack the rich content that researchers need. Note selection methodology that goes beyond keyword searches is imperative for assessing treatment over time or understanding the sequence of events, especially given that EHR notes frequently lack sufficient detail regarding the timing and duration of illness.
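
The contrast between keyword-only screening and a selection step that also checks for content richness can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the regular expression for timing cues, and the thresholds (`min_words`, `min_temporal`) are all hypothetical choices made for illustration, and a real pipeline would likely need clinical NLP tooling rather than simple patterns.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for explicit timing/duration cues (assumption, not from
# the abstract): numeric dates, "<n> days/weeks/months/years", "since <year>".
TEMPORAL_PATTERN = re.compile(
    r"\b(?:\d{1,2}/\d{1,2}/\d{2,4}"
    r"|\d+\s+(?:day|week|month|year)s?"
    r"|since\s+\d{4})\b",
    re.IGNORECASE,
)


@dataclass
class NoteScore:
    note_id: str
    keyword_hits: int
    word_count: int
    temporal_mentions: int


def score_note(note_id, text, keywords):
    """Count keyword matches, words, and timing cues in one note."""
    lowered = text.lower()
    keyword_hits = sum(
        len(re.findall(r"\b" + re.escape(k.lower()) + r"\b", lowered))
        for k in keywords
    )
    return NoteScore(
        note_id=note_id,
        keyword_hits=keyword_hits,
        word_count=len(text.split()),
        temporal_mentions=len(TEMPORAL_PATTERN.findall(text)),
    )


def select_notes(notes, keywords, min_words=20, min_temporal=1):
    """Keep notes that match a keyword AND meet the illustrative richness thresholds.

    `notes` maps a note ID to its free text. The thresholds are placeholders,
    not validated cut-offs.
    """
    selected = []
    for note_id, text in notes.items():
        score = score_note(note_id, text, keywords)
        if (
            score.keyword_hits > 0
            and score.word_count >= min_words
            and score.temporal_mentions >= min_temporal
        ):
            selected.append(score)
    # Rank notes with more timing cues, then longer notes, first.
    return sorted(
        selected,
        key=lambda s: (s.temporal_mentions, s.word_count),
        reverse=True,
    )
```

Under these assumptions, a short note such as "Pt has diabetes." would pass a keyword-only screen for "diabetes" but would be dropped by `select_notes`, because it contains too few words and no timing cue. Relative or vague expressions of time (for example, "a while ago") would not be caught by this pattern, which reflects the temporal extraction gap noted in the results.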

Code

SA13

Topic

Organizational Practices, Real World Data & Information Systems, Study Approaches

Topic Subcategory

Best Research Practices, Electronic Medical & Health Records, Health & Insurance Records Systems, Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas