Personalized Treatment Optimization Using Reinforcement Learning on Real-World Data (RWD)

Author(s)

Evangelos Vagianos, MSc, Angeliki Revelou, MSc, Nikolaos Kountouris, MSc.
R&D-RWD, Pfizer, Athens, Greece.
OBJECTIVES: This study aimed to evaluate the effectiveness of a Reinforcement Learning (RL) model in optimizing treatment strategies for type 2 diabetes. The goal was to determine whether RL-based recommendations align with established clinical guidelines and to assess the model’s potential to support adaptive clinical decision-making, using synthetic data designed to resemble real-world data (RWD).
METHODS: A simulation environment was developed using synthetic patient-level data representing diverse patient characteristics and disease progression patterns in type 2 diabetes. The RL model (Advantage Actor-Critic, A2C) was trained to optimize treatment decisions over time, using clinical variables such as HbA1c, body mass index (BMI), and cardiovascular risk. The action space included common antidiabetic treatments (e.g., metformin, SGLT-2 inhibitors, insulin). Treatment sequences were evaluated using a cumulative reward system prioritizing glycemic control and long-term outcomes. A user interface was created to visualize patient trajectories and model recommendations.
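As a rough illustration of the setup described in METHODS, the sketch below shows one way a simulated treatment step and a cumulative reward could be written. It is not the authors' implementation. The state fields and treatment names come from the abstract; everything else is a hypothetical assumption made for illustration: the per-step HbA1c effects, the progression drift, the 7.0% target, the switch penalty, and the discount factor.

```python
import random
from dataclasses import dataclass

# Action space named in the abstract, plus "none" for no pharmacologic treatment.
ACTIONS = ("none", "metformin", "sglt2_inhibitor", "insulin")

# ASSUMPTION: illustrative per-step HbA1c reductions (percentage points),
# not clinical estimates.
EFFECT = {"none": 0.0, "metformin": 0.3, "sglt2_inhibitor": 0.25, "insulin": 0.5}


@dataclass
class PatientState:
    hba1c: float    # percent
    bmi: float      # kg/m^2
    cv_risk: float  # hypothetical 0..1 cardiovascular risk score


def reward(hba1c, action, prev_action, target=7.0, switch_penalty=0.1):
    """Penalize HbA1c above target and (ASSUMPTION) each change of therapy."""
    excess = max(0.0, hba1c - target)
    switched = prev_action is not None and action != prev_action
    return -excess - (switch_penalty if switched else 0.0)


def step(state, action, prev_action, rng, drift=0.1):
    """Advance the toy patient one decision period; returns (next_state, reward)."""
    noise = rng.gauss(0.0, 0.05)
    # ASSUMPTION: HbA1c drifts upward (progression) and is lowered by treatment.
    new_hba1c = max(4.0, state.hba1c + drift - EFFECT[action] + noise)
    next_state = PatientState(new_hba1c, state.bmi, state.cv_risk)
    return next_state, reward(new_hba1c, action, prev_action)


def discounted_return(rewards, gamma=0.99):
    """Cumulative discounted reward over a treatment sequence."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    state, prev, rewards = PatientState(8.2, 31.0, 0.2), None, []
    for action in ("metformin", "metformin", "sglt2_inhibitor"):
        state, r = step(state, action, prev, rng)
        rewards.append(r)
        prev = action
    print(round(discounted_return(rewards), 3))
```

A discount factor close to 1 is one simple way to weight long-term outcomes; an actor-critic agent such as A2C would be trained against an environment of this general shape rather than against a fixed action sequence.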
RESULTS: The RL model’s recommendations were consistent with clinical guidelines: metformin was most frequently recommended as first-line therapy, while insulin was suggested only in later stages when HbA1c remained above target. In scenarios with well-controlled diabetes (HbA1c <6.5%), the model occasionally recommended no pharmacologic treatment. The A2C algorithm outperformed alternative approaches in achieving sustained glycemic control and minimizing therapy changes.
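The two comparison criteria named in RESULTS, sustained glycemic control and number of therapy changes, could be computed from a simulated trajectory roughly as follows. This is a minimal sketch, not the authors' evaluation code; the function names and the 7.0% target are hypothetical choices, since the abstract does not state the target or the exact metric definitions.

```python
def therapy_changes(actions):
    """Count how often the recommended treatment differs from the previous step."""
    return sum(1 for prev, curr in zip(actions, actions[1:]) if prev != curr)


def fraction_at_target(hba1c_values, target=7.0):
    """Share of time steps with HbA1c at or below the (ASSUMED) target."""
    if not hba1c_values:
        return 0.0
    return sum(1 for v in hba1c_values if v <= target) / len(hba1c_values)


if __name__ == "__main__":
    actions = ["metformin", "metformin", "sglt2_inhibitor", "sglt2_inhibitor"]
    hba1c = [8.1, 7.6, 7.0, 6.8]
    print(therapy_changes(actions), fraction_at_target(hba1c))
```

Under these definitions, a policy would be preferred when it reaches a higher fraction of steps at target with fewer therapy changes.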
CONCLUSIONS: The RL model demonstrated alignment with guideline-based care and showed potential to personalize treatment for type 2 diabetes. While developed using synthetic data, this approach offers a scalable foundation for clinical decision support systems trained on real-world data, with implications for improving patient outcomes and reducing unnecessary treatment costs.

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

MSR166

Topic

Methodological & Statistical Research, Patient-Centered Research, Real World Data & Information Systems

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

Diabetes/Endocrine/Metabolic Disorders (including obesity), No Additional Disease & Conditions/Specialized Treatment Areas
