Development of a Novel Approach to Improve Binary Classification Prediction Tasks When Encountering Time-To-Event Data: Performance Comparison of Three Alternative Approaches To Predict Stimulant Use Disorder Among Persons Authorized To Purchase...

Moderator

Allen M Smith, PharmD, University of Arkansas for Medical Sciences (UAMS), Little Rock, AR, United States

Speakers

Horacio Gomez-Acevedo; Corey J Hayes, MPH, PharmD, PhD; Melody Greer; Bradley Martin, RPh, PharmD, PhD, University of Arkansas for Medical Sciences (UAMS), Little Rock, AR, United States

OBJECTIVES: Discrete-time frameworks, in which time-to-event (TTE) data are converted into a person-period dataset that splits each subject's follow-up time into equally spaced intervals, have demonstrated better model discrimination than continuous-time survival modeling approaches. However, this framework may be improved by informing prediction in each time interval with updated features from prior intervals. Binary classification, continuous-time, and discrete-time-updating approaches were compared to predict stimulant use disorder (StUD) risk among Arkansas medical marijuana (MMJ) cardholders.
METHODS: Arkansas statewide health insurance claims data from November 2018 through December 2023 were used to construct a time-to-event dataset. For the binary classification approach, random forest (RF) and logistic regression (LR) models were trained using person-level dataset features and an event indicator (1: StUD|0: No StUD). The continuous-time approach added a continuous follow-up time variable to the features and event indicator of the person-level dataset to train random survival forest (RSF) and Cox proportional hazards (CPH) models. For the discrete-time-updating approach, a person-period dataset with 90-day time intervals was constructed to train RSF, CPH, RF, and LR models, where features from each subject’s previous 6 months informed prediction of StUD for each time interval. Training and testing sets for all approaches were derived from a 70:30 random split. 1:100 random undersampling was performed on the person-period training set. Hyperparameter tuning with 10-fold cross-validation was used for all approaches. Model performance was evaluated using the c-statistic and the cumulative/dynamic area under the receiver-operating characteristic (C/D AUC-ROC) curve.
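ILLUSTRATION (not part of the published abstract): The person-period construction and random undersampling described in the methods can be sketched roughly as follows. This is a minimal sketch, not the authors' code. The record keys (`id`, `followup_days`, `event`), the function names, and the reading of "1:100" as at most 100 non-event rows per event row are all assumptions made for illustration. Updating features from the previous 6 months is not shown.

```python
import random


def to_person_period(subjects, interval_days=90):
    """Expand person-level records into one row per subject per interval.

    Each subject is a dict with hypothetical keys:
      'id', 'followup_days' (days to event or censoring), 'event' (1 or 0).
    """
    rows = []
    for s in subjects:
        # Ceiling division: a partial final interval still counts as a row.
        n_intervals = max(1, -(-s["followup_days"] // interval_days))
        for k in range(n_intervals):
            is_last = k == n_intervals - 1
            rows.append({
                "id": s["id"],
                "interval": k + 1,
                "start_day": k * interval_days,
                # The event is recorded only in the subject's final interval.
                "event": 1 if (is_last and s["event"] == 1) else 0,
            })
    return rows


def undersample(rows, ratio=100, seed=0):
    """Keep every event row and at most `ratio` non-event rows per event row.

    Assumption: this is one plausible reading of 1:100 undersampling.
    """
    rng = random.Random(seed)
    pos = [r for r in rows if r["event"] == 1]
    neg = [r for r in rows if r["event"] == 0]
    n_keep = min(len(neg), ratio * len(pos))
    return pos + rng.sample(neg, n_keep)
```

In this sketch, a subject followed for 200 days with an event contributes three rows (days 0, 90, and 180), with the event flagged only in the third.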
RESULTS: The discrete-time-updating approach achieved the highest discrimination for predicting a new StUD diagnosis (mean C/D AUC-ROC=LR:0.839|CPH:0.818|RSF:0.795|RF:0.762), compared to the continuous-time (mean C/D AUC-ROC=CPH:0.755|RSF:0.755) and binary classification (c-statistic=LR:0.734|RF:0.744) approaches.
CONCLUSIONS: The discrete-time-updating approach, which incorporates updated features from prior intervals, outperformed more traditional classification approaches in predicting StUD and should be considered for other TTE classification tasks.
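ILLUSTRATION (not part of the published abstract): For readers unfamiliar with the c-statistic reported for the binary classification models, it can be computed as the proportion of event/non-event pairs in which the event case receives the higher predicted score, with ties counted as half. The sketch below is a generic pairwise implementation under that standard definition; the function name and the toy inputs are hypothetical, and it does not reproduce the cumulative/dynamic AUC-ROC used for the time-to-event models.

```python
def c_statistic(y_true, scores):
    """Pairwise concordance for a binary outcome (equivalent to AUC-ROC)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # event case ranked above non-event case
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Toy example with made-up scores: 3 of 4 pairs are concordant.
print(c_statistic([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75
```

The double loop is quadratic in sample size; it is meant to make the definition explicit rather than to scale to claims-sized data.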

Conference/Value in Health Info

2025-05, ISPOR 2025, Montréal, Quebec, CA

Value in Health, Volume 28, Issue S1

Code

MSR34

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

SDC: Systemic Disorders/Conditions (Anesthesia, Auto-Immune Disorders (n.e.c.), Hematological Disorders (non-oncologic), Pain)

Presentation (CTI)