AI-ASSISTED APPROACHES FOR DEFINING LINES OF THERAPY IN ONCOLOGY REAL-WORLD DATA: AN EXPLORATORY ANALYSIS IN CHRONIC LYMPHOCYTIC LEUKEMIA
Author(s)
Eric Chen, PhD, Catherine Fu, MS, Keri Yang, PhD, MPH, MBA, MS, BSPharm.
BeOne Medicines USA, Inc., San Carlos, CA, USA.
OBJECTIVES: Accurate derivation of lines of therapy (LOT) is foundational for real-world oncology research, informing treatment pattern analyses, comparative effectiveness evaluations, and burden-of-illness studies. However, LOT determination remains inconsistent and highly variable in practice. This study aimed to compare AI-assisted and traditional manual LOT derivation approaches to understand how methodological choices affect LOT results.
METHODS: An exploratory comparison of two LOT approaches was conducted in 500 randomly selected chronic lymphocytic leukemia (CLL) patients from Symphony Health Integrated Dataverse®, a nationally representative US open-claims database. The first, AI-assisted approach used GPT-5.1, prompted with National Comprehensive Cancer Network (NCCN) guidelines and treatment regimen information, to generate LOT definitions. The second approach used manually constructed NCCN guideline-based rules that were further refined through manual expert review. LOT outputs from both approaches were compared to assess consistency in LOT and treatment regimen assignments. Potential sources of discrepancy were further evaluated.
RESULTS: Among 500 CLL patients evaluated, 404 (80.8%) demonstrated full LOT consistency across both approaches. Line-count mismatches occurred in 71 patients (14.2%), largely due to differences in detecting treatment-switch events, including 25 cases involving missing switch lines. Among patients with matching line counts, regimen-level discrepancies occurred in 25 patients (5.0%), comprising 38 mismatched line pairs. Common patterns reflected divergent interpretations of monotherapy versus combination regimens or missing regimen components. AI prompt design and input rule sets also affected assignment consistency.
CONCLUSIONS: AI-assisted LOT derivation offers a scalable and transparent alternative to manual rule-based methods, substantially reducing time burden and enabling systematic testing of methodological assumptions. Nonetheless, this study underscores that LOT outputs remain sensitive to operational definitions and require expert adjudication. Accurate performance depends on clear specification of AI prompts and foundational regimen information. Findings reinforce the need for standardized LOT frameworks to enhance reproducibility and comparability of oncology real-world data analyses.
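ILLUSTRATIVE SKETCH: The three-way consistency classification reported in RESULTS (full consistency, line-count mismatch, regimen-level mismatch) could be computed along the following lines. This is a minimal sketch and not the authors' implementation: the function names, the input layout (one list of regimens per patient, each regimen a list of drug names), and the toy data are all assumptions made for illustration, and a real comparison would also need the date logic that defines a treatment switch.

```python
from collections import Counter


def normalize(regimen):
    """Treat a regimen as an unordered, case-insensitive set of drug names."""
    return frozenset(drug.strip().lower() for drug in regimen)


def classify_patient(ai_lines, manual_lines):
    """Return (category, n_mismatched_line_pairs) for one patient.

    Both arguments are ordered lists of regimens, one entry per line of therapy.
    """
    # Different numbers of lines: regimen-level comparison is not attempted.
    if len(ai_lines) != len(manual_lines):
        return "line_count_mismatch", 0
    # Same number of lines: compare regimens line by line.
    mismatched = sum(
        normalize(a) != normalize(m) for a, m in zip(ai_lines, manual_lines)
    )
    if mismatched:
        return "regimen_mismatch", mismatched
    return "full_consistency", 0


def summarize(ai_by_patient, manual_by_patient):
    """Tally categories over patients present in both outputs."""
    counts = Counter()
    mismatched_pairs = 0
    for pid in ai_by_patient.keys() & manual_by_patient.keys():
        category, n = classify_patient(ai_by_patient[pid], manual_by_patient[pid])
        counts[category] += 1
        mismatched_pairs += n
    return counts, mismatched_pairs


# Hypothetical toy data (not drawn from the study).
ai = {
    "p1": [["ibrutinib"], ["venetoclax", "obinutuzumab"]],
    "p2": [["acalabrutinib"]],
    "p3": [["venetoclax"]],
}
manual = {
    "p1": [["ibrutinib"], ["Obinutuzumab", "venetoclax"]],  # same regimens, different order/case
    "p2": [["acalabrutinib"], ["venetoclax"]],              # extra switch line
    "p3": [["venetoclax", "rituximab"]],                    # monotherapy vs combination
}
counts, mismatched_pairs = summarize(ai, manual)
```

Comparing regimens as sets keeps drug order and capitalization from registering as discrepancies, so only genuine differences in regimen components (for example, monotherapy versus combination) are counted as mismatched line pairs.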
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR218
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, SDC: Oncology