Drifting in the Network: Catching Baseline Shifts Before They Wreck Your Network Meta-Analysis
Author(s)
Saswata Paul Choudhury, MSc, Kalpesh Chatterjee, MSc, Sekhar Kumar Dutta, MSc, Abhirup Dutta Majumdar, MSc.
PharmaQuant Insights Private Limited, Kolkata, India.
OBJECTIVES: In network meta-analysis (NMA), the transitivity assumption requires that included studies be sufficiently comparable with respect to key effect modifiers, such as baseline patient characteristics (BCxs). However, heterogeneity in BCxs across studies may indicate distributional drift, similar to covariate shift in machine learning (ML), potentially biasing indirect comparisons. To address this, we propose a clustering-based approach to detect such drift, thereby enhancing the feasibility and credibility of NMA.
METHODS: We analyzed a simulated dataset of 15 studies, each reporting 8 BCx variables, using K-means clustering. The optimal number of clusters was selected with the elbow method, which evaluates the within-cluster sum of squares (WCSS) across candidate values of k. Clustering was performed using the Hartigan and Wong (1979) algorithm to minimize intra-cluster variability. We then applied relative importance analysis (RIA) to quantify each feature’s contribution to cluster separation, analogous to feature attribution methods in ML. Finally, Jensen-Shannon divergence (JSD) was computed on the marginal distribution of each BCx variable across clusters, giving a quantitative measure of distributional drift between clusters.
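The elbow step can be illustrated with a minimal sketch. It makes several assumptions that are not in the abstract: the toy data generator `simulate_studies` is hypothetical (it is not the authors' simulated dataset), the BCx variables are treated as already standardized, and plain Lloyd-style K-means with random restarts is swapped in for the Hartigan and Wong algorithm to keep the example dependency-free.

```python
import random


def simulate_studies(seed=1):
    """Hypothetical toy data: 15 studies x 8 standardized BCx variables.

    The last 3 studies are shifted on variables 0, 4 and 5 to mimic drift.
    """
    rng = random.Random(seed)
    studies = []
    for i in range(15):
        row = [rng.gauss(0.0, 1.0) for _ in range(8)]
        if i >= 12:
            for j in (0, 4, 5):
                row[j] += 6.0
        studies.append(row)
    return studies


def sq_dist(p, c):
    """Squared Euclidean distance between a study and a cluster center."""
    return sum((x - y) ** 2 for x, y in zip(p, c))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd-style K-means (not Hartigan-Wong); returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each study goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


def wcss(points, labels, centers):
    """Within-cluster sum of squares for a given partition."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def best_kmeans(points, k, restarts=20):
    """Keep the restart with the lowest WCSS; returns (wcss, labels)."""
    runs = []
    for s in range(restarts):
        labels, centers = kmeans(points, k, seed=s)
        runs.append((wcss(points, labels, centers), labels))
    return min(runs, key=lambda r: r[0])


def elbow_curve(points, k_max=5, restarts=20):
    """WCSS for k = 1..k_max; the 'elbow' is where the drop levels off."""
    return {k: best_kmeans(points, k, restarts)[0] for k in range(1, k_max + 1)}


if __name__ == "__main__":
    for k, value in elbow_curve(simulate_studies()).items():
        print(k, round(value, 1))
```

Reading the elbow remains a visual judgment: one looks for the k after which additional clusters reduce WCSS only marginally. With as few as 15 studies, the result can be sensitive to scaling and to initialization, which is why the sketch keeps the best of several restarts.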
RESULTS: The elbow method identified k = 2 as the optimal clustering solution. K-means partitioned the studies into Cluster 1 (n = 12) and Cluster 2 (n = 3). Both RIA and JSD indicated substantial drift between clusters; the primary contributors to between-cluster variability were X1 (34%), X5 (25%), and X6 (13%). This suggests that drift in these BCxs may influence comparability across studies.
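For reference, JSD between two discrete distributions P and Q is 0.5·KL(P‖M) + 0.5·KL(Q‖M) with M = (P + Q)/2; with base-2 logarithms it lies between 0 (identical) and 1 (no overlap). The sketch below shows one way to compute it for a single BCx variable across two clusters. The histogram binning, the number of bins, and the helper name `jsd_for_variable` are assumptions for illustration; the abstract does not state how the marginal distributions were estimated.

```python
import math


def kl_divergence(p, m):
    """Kullback-Leibler divergence KL(p || m) in bits; terms with p_i = 0 are skipped."""
    return sum(pi * math.log2(pi / mi) for pi, mi in zip(p, m) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def binned_probs(values, lo, width, n_bins):
    """Relative frequencies of values over n_bins equal-width bins starting at lo."""
    counts = [0] * n_bins
    for v in values:
        b = min(int((v - lo) / width), n_bins - 1)  # top edge goes in the last bin
        counts[b] += 1
    return [c / len(values) for c in counts]


def jsd_for_variable(cluster_a, cluster_b, n_bins=5):
    """JSD of one variable between two clusters, using shared histogram bins.

    Assumption: marginals are approximated by equal-width histograms over
    the pooled range of both clusters.
    """
    pooled = list(cluster_a) + list(cluster_b)
    lo, hi = min(pooled), max(pooled)
    if hi == lo:
        return 0.0  # the variable is constant, so the marginals coincide
    width = (hi - lo) / n_bins
    p = binned_probs(cluster_a, lo, width, n_bins)
    q = binned_probs(cluster_b, lo, width, n_bins)
    return jsd(p, q)
```

Calling `jsd_for_variable` once per BCx variable gives a per-variable drift score that can be ranked alongside the RIA contributions. With a cluster of only 3 studies, any histogram estimate is coarse, so the values are better read as a rough ordering than as precise divergences.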
CONCLUSIONS: Drift in BCxs across studies, if unaccounted for, may violate the transitivity assumption in NMA. A clustering-based approach enables the detection of such drift patterns, helping identify more homogeneous subsets of studies. Statisticians may leverage these clusters to perform stratified or sensitivity analyses, improving the robustness of treatment effect estimates in the presence of distributional shift.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR78
Topic
Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas