A METHODOLOGICAL COMPARISON OF THREE LANDMARKING APPROACHES FOR TIME-TO-EVENT PREDICTION: PREDICTING DIAGNOSED CANNABIS USE DISORDER AMONG MEDICAL MARIJUANA CARDHOLDERS IN ARKANSAS
Author(s)
Allen M. Smith, PharmD1, Bradley C. Martin, RPh, PharmD, PhD2, Horacio Gomez-Acevedo, PhD3, Corey J. Hayes, MPH, PharmD, PhD2, Melody Greer, PhD3, Chenghui Li, PhD3;
1University of Arkansas for Medical Sciences (UAMS), Post Doctoral Fellow, Little Rock, AR, USA, 2University of Arkansas for Medical Sciences (UAMS), Little Rock, AR, USA, 3University of Arkansas for Medical Sciences, Little Rock, AR, USA
OBJECTIVES: Landmarking is a time-to-event modeling framework where risk predictions are updated at predefined landmark times over fixed windows. Traditional landmarking strategies include landmark supermodeling, which pools data from multiple landmark times into a single extended dataset, and strict landmarking, which fits separate models at each landmark time. The current study additionally evaluates cumulative landmarking, a hybrid approach that progressively expands the training dataset while fitting separate models at successive landmarks. This study compares predictive performance across these landmarking strategies.
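To make the three strategies concrete, the following is a minimal sketch of how their training sets could be assembled from the same per-landmark rows. It is an illustration under stated assumptions, not the authors' implementation: the names `Subject`, `Row`, `landmark_rows`, and `build_training_sets`, the toy follow-up data, and the landmark times are all hypothetical, and cumulative landmarking is read here as "the model at a given landmark trains on rows from that landmark and all earlier ones".

```python
from typing import NamedTuple


class Subject(NamedTuple):
    sid: int
    time: float   # days from index date to event or censoring
    event: bool   # True if the event was observed at `time`


class Row(NamedTuple):
    sid: int
    landmark: float
    time_in_window: float   # follow-up after the landmark, capped at the window
    event_in_window: bool


def landmark_rows(subjects, landmark, window):
    """Rows for subjects still at risk at `landmark`, censored at landmark + window."""
    rows = []
    horizon = landmark + window
    for s in subjects:
        if s.time <= landmark:
            continue  # event or censoring already occurred: not at risk
        event_in_window = s.event and s.time <= horizon
        time_in_window = min(s.time, horizon) - landmark
        rows.append(Row(s.sid, landmark, time_in_window, event_in_window))
    return rows


def build_training_sets(subjects, landmarks, window):
    """Return (strict, supermodel, cumulative) training sets.

    Assumption: strict and cumulative yield one dataset per landmark,
    while the supermodel uses a single pooled dataset.
    """
    per_landmark = {lm: landmark_rows(subjects, lm, window) for lm in landmarks}
    strict = per_landmark                      # one dataset, one model per landmark
    supermodel = [r for lm in landmarks for r in per_landmark[lm]]  # pooled
    cumulative, pooled = {}, []
    for lm in landmarks:                       # training data grows with each landmark
        pooled = pooled + per_landmark[lm]
        cumulative[lm] = pooled
    return strict, supermodel, cumulative


# Hypothetical toy data: four subjects, landmarks every 90 days, 90-day window.
subjects = [
    Subject(1, 45, True),
    Subject(2, 200, False),
    Subject(3, 130, True),
    Subject(4, 400, False),
]
landmarks = [0, 90, 180]
strict, supermodel, cumulative = build_training_sets(subjects, landmarks, 90)
```

In this sketch the same subject contributes one row per landmark at which they remain at risk, so a pooled supermodel would typically include the landmark time as a covariate and account for the repeated rows per subject.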
METHODS: Arkansas statewide administrative health claims data from November 2018 through December 2023 were used to construct discrete-time datasets for predicting new cannabis use disorder diagnoses among medical marijuana (MMJ) cardholders. Five classifiers were evaluated with each landmarking strategy: Random Survival Forest, Support Vector Machine Survival (SVMS), Cox proportional hazards, Random Forest, and Logistic Regression. Models were trained using a 50:50 train-test split and 1:25 random undersampling. Performance was compared at 90-day, 180-day, and 360-day prediction horizons using the mean cumulative sensitivity/dynamic specificity area under the receiver operating characteristic curve (AUC-ROC) and inverse probability of censoring weighted Brier scores (IPCWBS), each with 95% confidence intervals (CIs). Both discrete-time and discrete-time-daily frameworks were evaluated for landmark supermodels, with the latter recording follow-up time on a daily scale while maintaining interval-based prediction updates, thereby more precisely representing partial follow-up within the final prediction interval.
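The 1:25 random undersampling step can be sketched as follows. This is a hedged illustration only: the abstract does not state the direction of the ratio or the implementation, so the sketch assumes 25 non-event rows are retained per event row in the training data, and the helper name `undersample`, its parameters, and the toy rows are hypothetical.

```python
import random


def undersample(rows, is_event, ratio=25, seed=0):
    """Keep every event row and at most `ratio` randomly chosen non-event rows per event.

    Assumption: "1:25" means one event row to 25 non-event rows.
    """
    events = [r for r in rows if is_event(r)]
    non_events = [r for r in rows if not is_event(r)]
    n_keep = min(len(non_events), ratio * len(events))
    rng = random.Random(seed)              # fixed seed for reproducibility
    kept = rng.sample(non_events, n_keep)  # sampling without replacement
    return events + kept


# Hypothetical toy training rows: 10 events among 1,000 rows.
train_rows = [{"id": i, "event": i < 10} for i in range(1000)]
balanced = undersample(train_rows, is_event=lambda r: r["event"])
```

In a setup like this, undersampling would be applied to the training half only, leaving the test half at its original event rate.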
RESULTS: Among 54,422 MMJ cardholders, the SVMS landmark supermodel trained with 360-day prediction windows achieved the highest discriminative performance [AUC-ROC (95% CI) = 0.7993 (0.7751, 0.8206)], outperforming strict landmarking [AUC-ROC (95% CI) = 0.6684 (0.6087, 0.7238)] and cumulative landmarking [AUC-ROC (95% CI) = 0.7390 (0.7072, 0.7680)]. Discriminative performance was comparable between the discrete-time and discrete-time-daily frameworks, and calibration was modestly better for discrete-time-daily landmark supermodels (IPCWBS point-estimate range: 0.00385-0.00735) than for discrete-time landmark supermodels (IPCWBS point-estimate range: 0.00465-0.02139).
CONCLUSIONS: Landmark supermodeling was the most discriminating among landmarking strategies. Incorporating a discrete-time-daily representation of follow-up time yielded minor calibration improvements, particularly for longer horizons.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR105
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics