COMPARATIVE EFFECTIVENESS OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING METHODS VERSUS TRADITIONAL DISPROPORTIONALITY ANALYSIS FOR ADVERSE DRUG REACTION SIGNAL DETECTION: A SYSTEMATIC REVIEW
Author(s)
Emeka E. Duru, BPharm1, Lotanna Ezeja, BPharm2, Azeez B. Aina, BPharm3, Fortune E. Olakunle, BPharm4
1University of Utah, Salt Lake City, UT, USA, 2Auburn University, Harrison School of Pharmacy, Auburn, AL, USA, 3Purdue University, Indianapolis, IN, USA, 4Swipha Pharma Nig, Lagos, Nigeria
OBJECTIVES: Post-marketing pharmacovigilance relies on statistical signal detection methods to identify potential adverse drug reactions (ADRs) in spontaneous reporting systems (SRS). While traditional disproportionality methods (proportional reporting ratio [PRR], reporting odds ratio [ROR], Bayesian Confidence Propagation Neural Network [BCPNN], Empirical Bayes Geometric Mean [EBGM]) remain standard practice, artificial intelligence (AI) and machine learning (ML) approaches have emerged as potential alternatives.
This systematic review aimed to compare the predictive performance of AI/ML methods versus traditional disproportionality analysis for safety signal detection in SRS databases.
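For readers unfamiliar with the traditional comparators, the two frequentist measures named above (PRR and ROR) can be sketched from a 2x2 contingency table of report counts. This is an illustrative sketch only, not code from any of the reviewed studies: the function name `disproportionality`, the cell labels `a`, `b`, `c`, `d`, and the example counts are all assumptions made for illustration, and the confidence interval uses the usual normal approximation on the log scale.

```python
import math


def disproportionality(a, b, c, d, z=1.96):
    """Compute PRR and ROR from a 2x2 table of spontaneous reports.

    Hypothetical cell layout (all counts must be > 0):
      a: reports with the drug of interest AND the event of interest
      b: reports with the drug of interest, other events
      c: reports with other drugs AND the event of interest
      d: reports with other drugs, other events
    """
    # Proportional reporting ratio: event proportion for the drug
    # divided by the event proportion for all other drugs.
    prr = (a / (a + b)) / (c / (c + d))

    # Reporting odds ratio: cross-product ratio of the 2x2 table.
    ror = (a * d) / (b * c)

    # Approximate 95% CI for the ROR on the log scale.
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ror_ci = (
        math.exp(math.log(ror) - z * se_log_ror),
        math.exp(math.log(ror) + z * se_log_ror),
    )
    return {"prr": prr, "ror": ror, "ror_ci": ror_ci}


# Made-up counts, purely for illustration.
result = disproportionality(a=20, b=80, c=100, d=9800)
print(round(result["prr"], 2), round(result["ror"], 2))  # 19.8 24.5
```

A drug-event pair is typically flagged when such a measure, or the lower bound of its interval, exceeds a chosen threshold; the thresholds themselves vary by study and are not specified here.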
METHODS: A comprehensive literature search was conducted across PubMed, Embase, and Web of Science from database inception through December 2025. Studies were included if they applied AI/ML algorithms to SRS data, compared ML performance against traditional disproportionality methods, reported quantitative performance metrics, and used validated reference standards. Systematic screening was conducted using DistillerSR with pre-specified PICO criteria. Standardized extraction captured study characteristics, ML algorithms, traditional comparators, and performance metrics.
RESULTS: Twelve studies met inclusion criteria, representing 4.1-65 million reports across FAERS (n=7), KAERS (n=2), KIDS-KD (n=1), French national database (n=1), and simulated data (n=1). ML approaches included gradient boosting (n=5), random forests (n=6), deep reinforcement learning (n=1), neural embeddings (n=1), and XGBoost (n=2). Nine studies (75%) demonstrated ML superiority over traditional methods. ML sensitivity ranged 43-100% versus 18-75% for traditional methods; ML AUROC ranged 0.52-1.0 versus 0.46-0.69. The best-performing approaches were a gradient boosting machine (AUROC 0.97 vs. 0.55 for traditional), a deep Q-network (+26% overall accuracy versus traditional), gradient boosting (4/5 adverse events detected in the first year versus zero for traditional), and neural embeddings (+14% AUROC improvement). Two studies found Bayesian methods or propensity score approaches comparable or superior due to data characteristics. Feature engineering beyond disproportionality enhanced ML performance. ML excelled at rare event and early detection.
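The sensitivity and AUROC figures reported above are both computed against a reference standard of known positive and negative drug-event pairs. The sketch below shows one way these two metrics can be calculated in plain Python; the labels, scores, and the 0.5 threshold are invented for illustration and do not come from any reviewed study. AUROC is computed here with the pairwise (Mann-Whitney) formulation, which is an assumption about method, not a description of what the studies used.

```python
def sensitivity(labels, scores, threshold):
    """Fraction of true signals (label 1) scored at or above the threshold."""
    true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    false_neg = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return true_pos / (true_pos + false_neg)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical reference set: 1 = known ADR, 0 = known non-ADR.
labels = [1, 1, 0, 0]
# Hypothetical signal scores from some detection method.
scores = [0.9, 0.4, 0.5, 0.1]

sens = sensitivity(labels, scores, threshold=0.5)
auc = auroc(labels, scores)
print(sens, auc)  # 0.5 0.75
```

Sensitivity depends on the chosen threshold while AUROC does not, which is one reason the two metrics can rank methods differently.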
CONCLUSIONS: AI/ML methods generally outperform traditional disproportionality analysis for safety signal detection, with advantages in sensitivity, early detection, and rare event identification.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR132
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas