Electronic Health Records With Unstructured Text to Predict Outcomes With Machine Learning: A Therapeutic Area Fingerprint

Author(s)

Cossio M1, Gilardino R2
1Universitat de Barcelona, Dubendorf, ZH, Switzerland, 2HE-Xperts Consulting LLC, Miami, FL, USA

OBJECTIVES: Given the rapidly growing application of machine learning (ML) to predict outcomes from unstructured text in electronic health records (EHRs), we assessed which therapeutic areas (TAs) or medical specialties are keen to employ unstructured data to capture disease-related information.

METHODS: We searched PubMed and Google Scholar using the criteria "Electronic Health Records" and "Machine Learning", screening all publications in English until September 2021. The variables analyzed were the number of patients and duration of data collection by therapeutic area, type of data structure, automatic text analysis techniques, and clinical outcomes. Data are presented as means for continuous variables and percentages for categorical variables.
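A search along these lines could be expressed against the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and does not send it. The abstract does not state the exact query string, the start date, or how the language filter was applied, so the `english[la]` tag, the `min_date` default, and the helper name `build_pubmed_query` are illustrative assumptions rather than the authors' method.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (used here only to build a URL).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(max_date="2021/09/30", min_date="1900/01/01"):
    """Return an ESearch URL combining the two search criteria from METHODS.

    Assumptions (not stated in the abstract): the language filter tag,
    the open-ended start date, and the end-of-September cutoff day.
    """
    term = '"Electronic Health Records" AND "Machine Learning" AND english[la]'
    params = {
        "db": "pubmed",       # search the PubMed database
        "term": term,         # boolean query string
        "datetype": "pdat",   # filter on publication date
        "mindate": min_date,  # ESearch expects mindate and maxdate together
        "maxdate": max_date,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_pubmed_query()
print(url)
```

Google Scholar offers no comparable official API, so that part of the search would presumably have been run by hand.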

RESULTS: We selected 117 papers covering 18 different therapeutic areas. Cardiovascular (27/117, 23%), Psychiatry (19/117, 16.2%), and Oncology (14/117, 11.9%) were the areas that most often used unstructured EHR data. 5/117 (4.2%) papers represented data from the EU and UK. Patient populations ranged from 577 to 2,341,877, and the duration of data capture ranged from 3 to 20.5 years. ICD was the principal coding language in 78/117 (67%) papers. Unstructured data were presented in 47/117 (40%) papers, and of the 20 automatic text analysis techniques recorded, cTAKES and MetaMap were the most frequent.
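The proportions reported above can be recomputed from the raw counts. This is a minimal sketch that assumes every fraction shares the denominator of 117 selected papers; the dictionary labels and the `share` helper are illustrative names, not part of the study.

```python
# Total number of selected papers, as reported in RESULTS.
TOTAL_PAPERS = 117

# Paper counts taken from the RESULTS paragraph.
counts = {
    "Cardiovascular": 27,
    "Psychiatry": 19,
    "Oncology": 14,
    "EU+UK data": 5,
    "ICD as principal coding language": 78,  # assumes a denominator of 117
    "Unstructured data": 47,
}


def share(count, total=TOTAL_PAPERS):
    """Return count as a percentage of total."""
    return 100.0 * count / total


shares = {label: share(n) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{TOTAL_PAPERS} = {pct:.1f}%")
```

Values rounded to one decimal place may differ by 0.1 percentage points from the figures printed in the abstract, depending on whether the original figures were rounded or truncated.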

CONCLUSIONS: The papers covered a wide diversity of medical specialties. However, we observed no common standards for the number of patients or the duration of studies. We likewise identified a lack of standardization in automatic text analysis, which is a concern because many under-resourced health centers store their clinical data only as unstructured text.

Conference/Value in Health Info

2022-11, ISPOR Europe 2022, Vienna, Austria

Value in Health, Volume 25, Issue 12S (December 2022)

Code

RWD102

Topic

Epidemiology & Public Health, Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Disease Classification & Coding

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
