AUTOMATED STUDY FEASIBILITY USING AGENTIC ARTIFICIAL INTELLIGENCE TO RAPIDLY IDENTIFY FIT-FOR-PURPOSE SECONDARY DATA

Author(s)

Nicola Sawalhi-Leckenby, MSc¹, Sophie E A Graham, PhD¹, Dimitra Lambrelli, MASc, MSc, PhD¹, Mireia Raluy Callado, MSc², Ashwin Kumar Rai, MS³, Mark Yates, BSc, PhD, MD¹.
¹Thermo Fisher Scientific, London, United Kingdom, ²Thermo Fisher Scientific, Stockholm, Sweden, ³Thermo Fisher Scientific, Overland Park, KS, USA.

Presentation Documents

Sawalhi-Leckenby et al_RWD35_ISPOR US 2026.pdf

OBJECTIVES: Identifying appropriate secondary data sources for real-world evidence generation studies is increasingly challenging as research questions demand granular clinical, temporal, and biomarker data. ISPOR task force guidance emphasizes transparent, reproducible evaluation of data source suitability; however, feasibility assessments often rely on manual processes and fragmented institutional knowledge. Scalable approaches that operationalize these principles across diverse data sources remain limited.
METHODS: A metadata-driven feasibility framework was developed, combining a standardized data catalogue with an AI agent-based decision-support interface. Data source catalogue development followed a reproducible workflow using a unified Data Element Grid (DEG) template, harmonized from internal knowledge and EMA catalogue metadata. For initial development, 100 claims and EHR databases were prioritized based on current use. DEGs were curated by subject-matter experts using a structured extraction process, with independent review by a second knowledgeable reviewer. The AI agent-based interface is being designed to interpret study requirements, align them with catalogue metadata, and generate standardized, transparent recommendations for data source suitability. The interface front end will allow non-expert users to enter study requirements as free text, which is then cross-checked against the catalogued metadata to generate data source recommendations.
RESULTS: The framework includes information on 100 databases, with consistent terminology and structure to facilitate access and interpretation by a broad spectrum of users. Pilot evaluations indicate that the metadata-driven feasibility framework can offer substantial efficiency gains, less manual searching, improved repeatability, and fewer errors compared with traditional approaches.
CONCLUSIONS: Combining a harmonized metadata foundation with an AI-driven decision-support interface can operationalize ISPOR principles for data source selection at scale and within shorter timeframes than manual assessments. This approach has the potential to support greater transparency, reproducibility, and confidence in selecting complex real-world data sources. Formal impact evaluations are ongoing to quantify time savings and output quality.
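ILLUSTRATIVE SKETCH: The cross-checking step described in METHODS can be pictured roughly as follows. This is a minimal sketch and not the authors' implementation. The abstract does not specify the DEG template fields, the scoring rule, or how free text is interpreted, so the class `DataElementGrid`, the function `assess_feasibility`, the coverage score, and the two example databases are all hypothetical. The sketch also assumes that study requirements have already been reduced to a set of data element names, which in the described framework is the AI agent's job.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementGrid:
    """Hypothetical, simplified stand-in for one catalogue entry (one DEG)."""
    source_name: str
    source_type: str             # e.g. "claims" or "EHR"
    elements: frozenset[str]     # data elements recorded as available


def assess_feasibility(required: set[str], catalogue: list[DataElementGrid]) -> list[dict]:
    """Rank sources by the share of required elements they cover.

    Assumption: a simple coverage fraction stands in for whatever
    suitability logic the real framework applies.
    """
    wanted = frozenset(e.lower() for e in required)
    results = []
    for deg in catalogue:
        available = frozenset(e.lower() for e in deg.elements)
        missing = sorted(wanted - available)
        coverage = len(wanted & available) / len(wanted) if wanted else 0.0
        results.append(
            {"source": deg.source_name, "coverage": coverage, "missing": missing}
        )
    # Highest coverage first; ties broken alphabetically for repeatability.
    return sorted(results, key=lambda r: (-r["coverage"], r["source"]))


# Made-up catalogue entries, for illustration only.
catalogue = [
    DataElementGrid("Example Claims DB", "claims",
                    frozenset({"diagnosis", "dispensing", "cost"})),
    DataElementGrid("Example EHR DB", "EHR",
                    frozenset({"diagnosis", "biomarker", "lab result"})),
]

ranked = assess_feasibility({"diagnosis", "biomarker"}, catalogue)
for row in ranked:
    print(row)
```

Returning the missing elements alongside the score is one way to keep each recommendation transparent: a user can see why a source ranked lower, not only that it did.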

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

RWD35

Topic

Real World Data & Information Systems

Topic Subcategory

Reproducibility & Replicability

Disease

No Additional Disease & Conditions/Specialized Treatment Areas

Presentation (CTI)