Augmenting Expertise: A Classifier Algorithm's Ability to Identify and Categorize Health Economics and Outcomes Research (HEOR) Publications

Author(s)

Morland R1, Cairns L1, Bell J1, McBurnie E1, Epworth A2, Adams R2, Huang Y3, van Den Broek R4
1Excerpta Medica, London, London, UK, 2DistillerSR, Ottawa, ON, Canada, 3Excerpta Medica, Amstelveen, Netherlands, 4Adelphi Communications, Amstelveen, North Holland, Netherlands

OBJECTIVES: Reviewing literature takes time for all stakeholders, especially those drafting systematic literature reviews (SLRs). Accurate inclusion of Health Economics and Outcomes Research (HEOR) can be challenging, as HEOR is a broad descriptive term. Machine-learning classifiers could assist by identifying HEOR publications for inclusion. The aim of this study was to train a classifier to identify HEOR publications based on HEOR expert (HEOR-E) input, run the classifier against HEOR publications, and compare the classifier's performance against that of non-HEOR experienced researchers (HEOR-N).

METHODS: DistillerSR software was populated with HEOR and non-HEOR publications from two clinical indications, excluding reviews and publications without an abstract. Randomly selected publications ('TRAIN') were dual-reviewed by six HEOR-E ("Does the title/abstract report HEOR data?", defined as humanistic, economic, or comparative-effectiveness outcomes). Conflicts were resolved via online/verbal discussion; the classifier was trained on the unconflicted data. In parallel, five HEOR-N single-reviewed publications from a second set ('TEST'). The trained classifier was applied to TEST, and conflicts between classifier and reviewer were examined. Descriptive statistics are reported.
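The workflow above — resolved expert labels on title/abstract text, then classifier training and application to an unseen set — can be sketched for illustration. DistillerSR's built-in classifier is proprietary, so the generic TF-IDF plus logistic-regression pipeline below (scikit-learn names) is an assumption, and the toy records are invented, not study data.

```python
# Illustrative sketch only: stands in for DistillerSR's proprietary classifier
# with a generic TF-IDF + logistic-regression pipeline (scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy title/abstract records with resolved HEOR-E labels (1 = HEOR, 0 = non-HEOR).
train_texts = [
    "Cost-effectiveness of drug A versus drug B in indication X",
    "Health-related quality of life outcomes in patients with indication X",
    "Phase 3 randomized trial of drug A efficacy and safety",
    "Molecular mechanism of receptor binding by drug B",
]
train_labels = [1, 1, 0, 0]

# Train on the unconflicted TRAIN labels.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)

# Apply the trained classifier to an unseen, TEST-style record.
pred = int(clf.predict(["Budget impact analysis of drug B adoption"])[0])
```

In practice the study's TRAIN set (n=245) is far larger than this toy example, and the classifier's predictions on TEST were compared against HEOR-N single-review decisions rather than taken as ground truth.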

RESULTS: In TRAIN (n=245), 52 conflicts were resolved, versus 122 conflicts in TEST (n=551). Most conflicts concerned the classification of reviews, epidemiology, or real-world safety studies as 'HEOR' or 'non-HEOR'. Trained classifier metrics were: balanced accuracy, 0.55; recall, 0.72; and F1, 0.75. Among classifier–reviewer conflicts, HEOR-N were correct for 61% of decisions and the classifier for 39%. The classifier excluded fewer records (144 vs 234), indicating a lower false-negative rate.
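The three reported metrics all derive from a screening confusion matrix. The sketch below shows the standard formulas; the counts are hypothetical, chosen only so that the formulas reproduce the reported values (0.55, 0.72, 0.75) — the abstract does not report the underlying matrix, and these counts are not the study's.

```python
# Standard screening metrics from a confusion matrix.
# tp = HEOR records correctly included, fn = HEOR records wrongly excluded,
# fp = non-HEOR records wrongly included, tn = non-HEOR records correctly excluded.
def screening_metrics(tp, fp, tn, fn):
    recall = tp / (tp + fn)              # sensitivity: share of HEOR records kept
    specificity = tn / (tn + fp)         # share of non-HEOR records excluded
    precision = tp / (tp + fp)
    balanced_accuracy = (recall + specificity) / 2
    f1 = 2 * precision * recall / (precision + recall)
    return balanced_accuracy, recall, f1

# Hypothetical counts that happen to reproduce the reported metrics.
ba, rec, f1 = screening_metrics(tp=558, fp=155, tn=95, fn=217)
```

For SLR screening, a high recall (low false-negative rate) matters more than precision, since wrongly excluded records are lost to the review while wrongly included ones are caught at full-text review.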

CONCLUSIONS: Expert-trained classifiers can identify publications reporting HEOR outcomes from the title/abstract, and can improve the efficiency and reliability of SLR development to support decision-making. Low false-negative rates are desirable during literature screening for SLRs to facilitate completeness. However, subject-area expertise remains crucial for identifying humanistic studies. Future discussions should refine the classification and subtyping system that defines HEOR publications, to improve the accurate identification of clinical versus HEOR publications.

Conference/Value in Health Info

2024-11, ISPOR Europe 2024, Barcelona, Spain

Value in Health, Volume 27, Issue 12, S2 (December 2024)

Code

SA73

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
