REPORTING OF STUDY DETAILS IN ABSTRACTS: INFORMATIVE FOR ARTIFICIAL INTELLIGENCE (AI) OR UNDERWHELMING?
Author(s)
Allie Cichewicz, MSc1; Marius Sauca, BSc, MSc2; Kevin Kallmes, BS, MA, JD3
1Nested Knowledge, Boston, MA, USA, 2Nested Knowledge, Utrecht, Netherlands, 3Nested Knowledge, St. Paul, MN, USA
OBJECTIVES: Advancements in AI, particularly large language models, streamline the initial review phase by helping researchers quickly identify relevant studies. However, accuracy is limited by the completeness of study information reported in abstracts. Checklists such as CONSORT-A, PRISMA-A, and STARD have helped standardize reporting, but differences persist across authors, journals, and study types. We aimed to synthesize evidence on the reporting frequency of critical study details in abstracts, identifying trends and gaps to inform AI-assisted screening methods.
METHODS: A comprehensive, living review was undertaken to identify studies evaluating the prevalence of key concepts commonly used to determine abstract-level eligibility for literature reviews: study type, data source(s), study registration, patient population, treatment(s), sample size, and outcomes.
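To make the eligibility-concept checklist concrete, the sketch below models the seven abstract-level concepts as a simple presence/absence record in Python. The field names and the reporting_rate helper are illustrative assumptions for this sketch, not the authors' actual extraction form.

```python
# A minimal sketch of the seven abstract-level eligibility concepts as a
# presence/absence record. Field names are illustrative, not the
# authors' extraction form.
from dataclasses import dataclass, asdict

@dataclass
class AbstractConceptPresence:
    study_type: bool = False
    data_source: bool = False
    registration: bool = False
    patient_population: bool = False
    treatment: bool = False
    sample_size: bool = False
    outcomes: bool = False

    def reporting_rate(self) -> float:
        """Fraction of the seven key concepts present in one abstract."""
        values = list(asdict(self).values())
        return sum(values) / len(values)

# Example: an abstract reporting only population, treatment, and sample size.
record = AbstractConceptPresence(patient_population=True, treatment=True, sample_size=True)
print(f"{record.reporting_rate():.0%} of key concepts reported")  # -> 43%
```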
RESULTS: As of December 2025, 47 studies were included, covering 37,177 abstracts. Most studies evaluated randomized controlled trials (RCTs) (10,132 abstracts [27.3%]; n=33 studies); the remainder covered systematic reviews (742 [2.0%]; n=6), observational studies (650 [1.8%]; n=2), diagnostic accuracy studies (616 [1.7%]; n=4), RCTs plus observational studies (130 [0.4%]; n=1), and all study types (24,907 [67.0%]; n=1). Across all study types, intervention/treatment (88%) and disease/condition (86%) were consistently well reported, while participant eligibility (60%), effectiveness outcomes (62%), and sample size (58%) showed moderate reporting; safety outcomes (38%), data source/setting (38%), and registration (27%) were poorly reported. Notably, diagnostic accuracy studies had strong sample size reporting (78%) but the poorest eligibility (26%) and registration (2%) details. Systematic reviews had strong study type identification (89%) but weak registration (6%), and RCTs showed particularly poor data source/setting reporting (32%) with highly variable study registration (1-99%).
CONCLUSIONS: The abstract-reporting evidence base is heavily skewed toward RCTs (70% of included studies), with limited representation of other study types. Most assessments used reporting guidelines (e.g., CONSORT-A) based on rigorous methodological requirements; this may overestimate gaps for AI-assisted screening, which may need only to assess the basic presence of concepts. Future assessments focused on PICO-based concept presence, rather than reporting quality, may provide more actionable insights for AI-based screening prompts.
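As a hedged illustration of the PICO-based concept-presence approach suggested above, the Python sketch below builds a screening prompt that asks only whether each PICO concept is mentioned, not whether it meets a reporting-guideline standard. The concept list and prompt wording are hypothetical; the resulting prompt would be passed to whatever model client a reviewer uses.

```python
# A minimal sketch of a concept-presence screening prompt, assuming the
# PICO framing suggested in the conclusions. Concept names and prompt
# wording are illustrative, not a validated screening instrument.
PICO_CONCEPTS = ["population", "intervention", "comparator", "outcomes"]

def build_presence_prompt(abstract_text: str) -> str:
    """Ask whether each PICO concept is mentioned at all, rather than
    whether it satisfies a reporting checklist such as CONSORT-A."""
    concept_list = ", ".join(PICO_CONCEPTS)
    return (
        "For the abstract below, state ONLY whether each of these "
        f"concepts is mentioned: {concept_list}. Respond as JSON with "
        "true/false per concept; do not judge reporting quality.\n\n"
        f"Abstract:\n{abstract_text}"
    )

if __name__ == "__main__":
    print(build_presence_prompt("Adults with type 2 diabetes were randomized..."))
```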
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR177
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas