AI-ASSISTED GENERATION OF SYNTHETIC PATIENT-LEVEL DATA TO RECREATE WEIGHT-CHANGE OUTCOMES FROM THE STEP-4 SEMAGLUTIDE TRIAL
Author(s)
Devendra Patil1, Hannah Vuong, PharmD Student1, Tewodros Eguale, PhD, MD1, Joanne Doucette, MS, MLIS2
1Massachusetts College of Pharmacy and Health Sciences, Boston, MA, USA, 2MCPHS University, Boston, MA, USA
OBJECTIVES: To evaluate whether an AI model (ChatGPT) can generate realistic synthetic patient-level data (PLD) and, after dropout patterns are simulated, accurately reproduce the weekly weight-change trajectory and participant counts reported in the STEP-4 trial.
METHODS: Data were extracted from the published STEP-4 manuscript, figures, and supplementary materials to construct a prompt for AI models. This prompt, along with the figures, was provided to ChatGPT (GPT-5.1) to generate synthetic baseline weights and weekly weight-change values for the full run-in population. The resulting synthetic data were imported into Power BI to standardize and format variables and to create a calculated measure for percent weight change at each week. Because the published STEP-4 analysis used a treatment-policy estimand with multiple imputation rather than observed attendance, the reported participant counts are non-monotonic. As raw observed weekly patient counts (Ns) were not available, the Ns reported in the footnotes of the STEP-4 weight-change figure (Figure C) were used as weekly targets for synthetic dropout. In Microsoft Excel, patients were randomly removed each week to match the reported Ns, with removed patients remaining inactive unless reactivation was needed to reconcile imputation-driven increases. The dropout-adjusted synthetic dataset was analyzed in SAS 9.4.
RESULTS: The final output reproduces the weekly sample sizes printed under the STEP-4 figure for both the semaglutide and placebo arms, confirming that dropout alignment was achieved. The synthetic weight-change trajectory closely replicated the published curve and generated week-68 mean weight changes (-17.7% for semaglutide vs -5.0% for placebo) that closely match the published treatment-policy estimand results (-17.4% vs -5.0%).
CONCLUSIONS: This analysis shows that LLMs can generate realistic synthetic patient-level data from published summary results and, with simulated dropout, closely replicate the weight-change trajectory and participant counts of the STEP-4 trial. This approach demonstrates the value of LLM-generated synthetic PLD for reproducibility, visualization, and exploratory analyses when actual patient-level data are unavailable.
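ILLUSTRATIVE SKETCH: The dropout-matching step described in METHODS was performed in Microsoft Excel. The Python below is only a minimal sketch of the same idea, not the authors' implementation. The function name simulate_dropout, the seed parameter, the patient IDs, and the weekly targets are all hypothetical; the targets are made-up numbers chosen to show a non-monotonic sequence and are not the Ns from the STEP-4 figure.

```python
import random


def simulate_dropout(patient_ids, weekly_targets, seed=0):
    """Return {week: set of active patient IDs} matching each weekly target N.

    Assumption: weekly_targets is a list of (week, target_n) pairs in
    chronological order. Patients are removed at random when the target
    falls and reactivated at random (from previously removed patients)
    only when the target rises.
    """
    rng = random.Random(seed)
    all_ids = list(patient_ids)
    active = set(all_ids)
    inactive = set()
    schedule = {}
    for week, target in weekly_targets:
        if target < 0 or target > len(all_ids):
            raise ValueError(f"target {target} at week {week} is out of range")
        if len(active) > target:
            # Target fell: randomly remove the surplus patients.
            removed = rng.sample(sorted(active), len(active) - target)
            active.difference_update(removed)
            inactive.update(removed)
        elif len(active) < target:
            # Target rose (e.g. an imputation-driven increase): reactivate
            # randomly chosen patients who were removed earlier.
            restored = rng.sample(sorted(inactive), target - len(active))
            inactive.difference_update(restored)
            active.update(restored)
        schedule[week] = set(active)
    return schedule


# Hypothetical example: 10 synthetic patients, made-up weekly targets.
example_targets = [(0, 10), (4, 8), (8, 9), (12, 6)]
example = simulate_dropout(range(1, 11), example_targets, seed=42)
for week, ids in example.items():
    print(week, len(ids))
```

Which patients are removed or reactivated depends on the random seed, but the number of active patients at each week always equals the target for that week, which is the property the abstract reports checking against the published Ns.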
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR50
Topic
Methodological & Statistical Research
Disease
SDC: Diabetes/Endocrine/Metabolic Disorders (including obesity)