Augmenting Expertise: A Classifier Algorithm's Ability to Identify and Categorize Health Economics and Outcomes Research (HEOR) Publications
Speaker(s)
Morland R1, Cairns L1, Bell J1, McBurnie E1, Epworth A2, Adams R2, Huang Y3, van Den Broek R4
1Excerpta Medica, London, UK, 2DistillerSR, Ottawa, ON, Canada, 3Excerpta Medica, Amstelveen, Netherlands, 4Adelphi Communications, Amstelveen, North Holland, Netherlands
OBJECTIVES: Reviewing literature is time-consuming for all stakeholders, especially those drafting systematic literature reviews (SLRs). Accurately including Health Economics and Outcomes Research (HEOR) can be challenging, as HEOR is a broad descriptive term. Machine-learning classifiers could assist by identifying HEOR publications for inclusion. The aim of this study was to train a classifier to identify HEOR publications based on input from HEOR experts (HEOR-E), apply the trained classifier to a separate set of publications, and compare its performance against researchers without HEOR experience (HEOR-N).
METHODS: DistillerSR software was populated with HEOR and non-HEOR publications from two clinical indications, excluding reviews and publications without an abstract. Randomly selected publications (‘TRAIN’) were dual-reviewed by six HEOR-E (“Does the title/abstract report HEOR data?”, with HEOR defined as humanistic, economic, or comparative-effectiveness outcomes). Conflicts were resolved via online/verbal discussion, and the classifier was trained on the resulting conflict-free data. In parallel, five HEOR-N single-reviewed a second set of publications (‘TEST’). The trained classifier was then applied to TEST, and conflicts between the classifier and reviewers were examined. Descriptive statistics are reported.
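The classifier used in the study is DistillerSR's built-in machine-learning feature, whose internals are proprietary and not described in the abstract. The sketch below is an illustration only of how a comparable title/abstract classifier could be assembled; the TF-IDF/logistic-regression pipeline, all variable names, and the example records are assumptions, not the study's actual method or data.

```python
# Illustrative sketch only: DistillerSR's classifier is proprietary, so this
# uses a generic TF-IDF + logistic-regression pipeline as a stand-in.
# All data and names below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# TRAIN: title+abstract text with expert-adjudicated labels
# (1 = reports HEOR data, 0 = does not), conflicts already resolved.
train_texts = [
    "Cost-effectiveness of drug X in indication Y: a Markov model ...",
    "Phase 3 randomized trial of drug Z: efficacy and safety ...",
]
train_labels = [1, 0]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # unigram/bigram features
    ("model", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
clf.fit(train_texts, train_labels)

# Apply to the TEST set; disagreements with the single human reviewer
# would then be examined, as in the study.
test_texts = ["Budget impact of introducing therapy W in indication Y ..."]
print(clf.predict(test_texts))  # 1 = flag as HEOR, 0 = non-HEOR
```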
RESULTS: In TRAIN (n=245), 52 conflicts were resolved, versus 122 conflicts in TEST (n=551). Most conflicts concerned whether reviews, epidemiology studies, or real-world safety studies should be classified as ‘HEOR’ or ‘non-HEOR’. Trained-classifier metrics were: balanced accuracy, 0.55; recall, 0.72; and F1, 0.75. On records where the classifier and reviewers disagreed, HEOR-N were correct for 61% of decisions and the classifier for 39%. The classifier excluded fewer records (144 vs 234), indicating a lower false-negative rate.
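The abstract reports summary metrics without the underlying confusion matrix. The sketch below shows how balanced accuracy, recall, and F1 derive from confusion-matrix counts; the counts used here are hypothetical, chosen only so the computed values land near the reported 0.55/0.72/0.75, and are not the study's actual results.

```python
# Hypothetical confusion matrix on a 551-record test set; the study's actual
# counts were not published. "Positive" = publication reporting HEOR data.
tp, fp, fn, tn = 300, 83, 117, 51  # illustrative values only

recall = tp / (tp + fn)              # sensitivity: HEOR records retained
specificity = tn / (tn + fp)         # non-HEOR records correctly excluded
precision = tp / (tp + fp)
balanced_accuracy = (recall + specificity) / 2
f1 = 2 * precision * recall / (precision + recall)

print(f"balanced accuracy={balanced_accuracy:.2f}, "
      f"recall={recall:.2f}, F1={f1:.2f}")
# -> balanced accuracy=0.55, recall=0.72, F1=0.75
```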
CONCLUSIONS: Expert-trained classifiers can identify publications reporting HEOR outcomes from the title/abstract alone and can improve the efficiency and reliability of SLR development to support decision-making. A low false-negative rate is desirable during literature screening for SLRs to facilitate completeness. However, subject-area expertise remains crucial for identifying humanistic studies. Future discussions should refine the classification and subtyping system that defines HEOR publications to improve the accurate identification of clinical versus HEOR publications.
Code
SA73
Topic
Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis
Disease
No Additional Disease & Conditions/Specialized Treatment Areas