PERFORMANCE OF ADAPTIVE SMART TAGS IN NESTED KNOWLEDGE FOR AUTOMATED EXTRACTION OF STUDY CHARACTERISTICS

Author(s)

Priccila Zuchinali, PhD¹, Sophie Yoon, MPH², Joanna Kamar, MPH¹, Amber Martin, BSc²
¹Thermo Fisher Scientific, Montreal, QC, Canada; ²Thermo Fisher Scientific, Waltham, MA, USA
OBJECTIVES: Nested Knowledge (NK) is an evidence synthesis platform used in systematic literature reviews (SLRs). This study evaluated the accuracy of Adaptive Smart Tags (ASTs) in NK for extracting study characteristics.
METHODS: An SLR of randomized trials evaluating the efficacy and safety of treatments for refractory chronic cough was conducted. Study characteristics were extracted using ASTs, which leverage OpenAI large language models within NK to identify relevant text or numeric values for predefined data elements. AST development involved two steps: (1) creating question-based prompts (“tags”) for each data element and organizing them hierarchically to convey concepts and relationships to the model, which then searched full-text publications for optimal responses; and (2) applying a human-in-the-loop approach in which two full-text articles were used to pilot and refine the prompts before deployment across all studies.
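As an illustration of the hierarchical tag organization described in METHODS, the following minimal Python sketch models question-based tags as a tree and flattens them into prompts that carry their position in the hierarchy. It is not NK's API or the authors' configuration: the `Tag` class, the `flatten` helper, and the question wording are all hypothetical, and only the data element names come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical question-based tag; not an NK data structure."""
    name: str
    question: str
    children: list["Tag"] = field(default_factory=list)


def flatten(tag, path=()):
    """Yield (hierarchy path, question) pairs in depth-first order."""
    current = path + (tag.name,)
    yield " > ".join(current), tag.question
    for child in tag.children:
        yield from flatten(child, current)


# Illustrative hierarchy; the question text is assumed, not taken from the study.
study_characteristics = Tag(
    name="Study characteristics",
    question="What are the key characteristics of this study?",
    children=[
        Tag("Publication type", "What type of publication is this?"),
        Tag(
            "Interventions",
            "Which interventions were evaluated?",
            children=[Tag("Comparator", "What was the comparator?")],
        ),
    ],
)

prompts = list(flatten(study_characteristics))
for path, question in prompts:
    print(f"{path}: {question}")
```

Carrying the full path with each question is one simple way to pass parent-child relationships to a model; how NK conveys the hierarchy internally is not stated in the abstract.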
RESULTS: AST performance was evaluated across 18 study characteristics in 37 studies. Accuracy was highest for bibliographic and core design elements, with correct information and formatting achieved for publication type (n=35; 95%), trial registration number (n=35; 95%), comorbidities (n=34; 92%), and interventions (n=29; 78%). Availability of key outcomes was frequently captured (range: 30-33 studies) for most of the outcomes of interest, including cough severity, 24-hour cough frequency, urge-to-cough, Leicester Cough Questionnaire scores, and other patient-reported outcomes. Performance declined for complex or inconsistently reported elements. Subgroup reporting showed poor accuracy, with incorrect information identified in 32 studies. Formatting errors were common for comparator identification (n=30; 81%), timepoints assessed (n=20; 54%), and trial name (n=15; 41%).
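The percentages in RESULTS can be reproduced from the reported study counts, assuming the denominator is the 37 included studies. The sketch below is only an arithmetic check; the variable and function names are illustrative.

```python
# Study counts as reported in RESULTS; the denominator of 37 is assumed
# to apply to every data element.
TOTAL_STUDIES = 37

correct_counts = {
    "publication type": 35,
    "trial registration number": 35,
    "comorbidities": 34,
    "interventions": 29,
}
formatting_error_counts = {
    "comparator identification": 30,
    "timepoints assessed": 20,
    "trial name": 15,
}


def percent_of_studies(count, total=TOTAL_STUDIES):
    """Return count as a whole-number percentage of total, rounding half up."""
    return int(count * 100 / total + 0.5)


correct_pct = {k: percent_of_studies(v) for k, v in correct_counts.items()}
formatting_error_pct = {
    k: percent_of_studies(v) for k, v in formatting_error_counts.items()
}

for element, pct in {**correct_pct, **formatting_error_pct}.items():
    print(f"{element}: {pct}%")
```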
CONCLUSIONS: Within NK, AI-driven ASTs showed good performance for structured, consistently reported study characteristics but were less reliable for nuanced, complex, or variably reported data elements. A human-in-the-loop workflow remains essential to ensure accuracy, particularly for subgroup data and detailed outcome specifications.

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

MSR189

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
