AI Models for Predicting Clinical Trial Success: Capabilities and Risks of Pattern-Driven Approaches
Author(s)
Ruth Bartelli Grigolon, PhD1, Julia Lima, MSc1, Otavio Clark, PhD2, Elise Berliner, PhD3, Renato Mantelli Picoli, PhD4.
1Oracle Life Sciences, São Paulo, Brazil, 2Oracle Life Sciences, New York, NY, USA, 3Oracle Life Sciences, Austin, TX, USA, 4Oracle Life Sciences, Monte Azul Paulista, Brazil.
OBJECTIVES: This study investigates the use of artificial intelligence (AI) to predict clinical trial outcomes, focusing on two key approaches: HINT (Hierarchical Interaction Network for Clinical Trial Outcome Predictions) and TrialBench (Multi-Modal AI-Ready Clinical Trial Datasets). Both aim to improve drug development efficiency by forecasting trial success, patient dropout, adverse events, and dosing decisions using multimodal data.
METHODS: HINT employs a hierarchical neural network that integrates drug structures, disease phenotypes, and trial design criteria. TrialBench compiles 23 standardized datasets from ClinicalTrials.gov and other sources. Both platforms use deep learning models that combine textual, tabular, and ontological data.
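To make the modeling approach concrete, the sketch below shows a generic multimodal fusion classifier of the kind both platforms describe: separate encoders for textual, tabular, and ontological inputs whose embeddings are concatenated and passed to a prediction head. This is an illustrative assumption in PyTorch, not the published HINT or TrialBench architecture; all layer sizes, input dimensions, and names are hypothetical.

# Illustrative sketch only: a generic multimodal fusion classifier in the
# spirit of the models described above, NOT the published architectures.
# All dimensions and feature choices are hypothetical assumptions.
import torch
import torch.nn as nn

class MultimodalTrialClassifier(nn.Module):
    def __init__(self, text_dim=768, tabular_dim=32, onto_dim=128, hidden=256):
        super().__init__()
        # One encoder per modality: text (e.g., eligibility criteria),
        # tabular (trial design fields), ontology (disease codes).
        self.text_enc = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.tab_enc = nn.Sequential(nn.Linear(tabular_dim, hidden), nn.ReLU())
        self.onto_enc = nn.Sequential(nn.Linear(onto_dim, hidden), nn.ReLU())
        # Fusion head: concatenate modality embeddings, output success probability.
        self.head = nn.Sequential(
            nn.Linear(hidden * 3, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, text_emb, tabular, onto_emb):
        fused = torch.cat(
            [self.text_enc(text_emb), self.tab_enc(tabular), self.onto_enc(onto_emb)],
            dim=-1,
        )
        return torch.sigmoid(self.head(fused)).squeeze(-1)

# Smoke test with random inputs for a batch of 4 hypothetical trials.
model = MultimodalTrialClassifier()
p_success = model(torch.randn(4, 768), torch.randn(4, 32), torch.randn(4, 128))
print(p_success.shape)  # torch.Size([4])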
RESULTS: TrialBench showed high performance on tasks such as patient-dropout prediction (F1 > 0.95) and serious adverse event (SAE) prediction (F1 ≈ 0.93) but underperformed on dosing prediction, trial approval classification, and failure-cause identification (F1 < 0.50). HINT presented mixed results: F1 scores for outcome prediction were 0.66 in Phase I, 0.62 in Phase II, and 0.84 in Phase III, with precision varying by disease area (from 0.58 for neoplasms to 0.86 for respiratory diseases). Despite these promising results, both models face important challenges. They offer limited interpretability, complicating clinical and regulatory adoption, and their data sources are often incomplete, inconsistently labeled, or biased. Annotations produced by generative models may introduce further uncertainty. Both approaches mainly target small-molecule drugs, limiting applicability to biologics, vaccines, and devices. They also overlook social, operational, and contextual factors that can affect trial outcomes. Performance across tasks remains uneven, especially in classifying trial failure causes.
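For reference, F1 is the harmonic mean of precision and recall, so the scores above balance false positives against false negatives. The minimal sketch below computes F1 and precision with scikit-learn on synthetic labels; the values are placeholders, not results from either model.

# Minimal sketch of the evaluation metric reported above. The labels are
# synthetic placeholders, not data from HINT or TrialBench.
from sklearn.metrics import f1_score, precision_score

y_true = [1, 0, 1, 1, 0, 1, 0, 1]  # 1 = trial succeeded, 0 = failed
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]  # hypothetical model predictions

print("F1:", f1_score(y_true, y_pred))            # 2PR / (P + R)
print("Precision:", precision_score(y_true, y_pred))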
CONCLUSIONS: Both models show potential for predicting clinical trial outcomes but have important limitations. Their reliance on historical data may reinforce "me-too" drug development that concentrates on well-known therapeutic areas and standard dosing. Databases such as ClinicalTrials.gov overrepresent Phase I-III trials, Western regions, and successful outcomes, creating feedback loops that amplify bias and underrepresentation. Finally, both models lack robust validation.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR19
Topic
Clinical Outcomes, Methodological & Statistical Research, Study Approaches
Disease
No Additional Disease & Conditions/Specialized Treatment Areas