Validation Between Antidepressant Treatment Failure Proxy and PHQ-9 Score in Major Depressive Disorder Using a Linked Insurance Claims and Electronic Health Record Dataset

Author(s)

Rashmi Patel, MD PhD1, Carissa White Dukes, MPH2, Suzanne St Rose, PhD3, Anne Kilburg, MSc3, Sigurd D. Suessmuth, MD3, Franco De Crescenzo, MD DPhil3, Ling Zhang, MPH4.
1University of Cambridge, Cambridge, United Kingdom, 2Boehringer Ingelheim Pharmaceuticals, Inc., Greenwood, IN, USA, 3Boehringer Ingelheim International GmbH, Ingelheim am Rhein, Germany, 4Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, CT, USA.
OBJECTIVES: Evaluating antidepressant response for major depressive disorder (MDD) in real-world data is challenging due to limited availability of structured symptom measures. This study aimed to assess whether antidepressant treatment failure can serve as a proxy for treatment non-response using Patient Health Questionnaire-9 (PHQ-9) data.
METHODS: Adults newly diagnosed with MDD who had ≥1 claim for antidepressants between 01/01/2012 and 06/30/2022 and ≥2 recorded PHQ-9 scores were identified from the Optum Market Clarity database. Treatment failure was defined as a switch or augmentation during a continuous treatment episode (prescriptions were linked into one episode when gaps between them were ≤120 days). PHQ-9 scores were obtained within 30 days of treatment episode initiation and within 30 days of episode end. Negative binomial regression estimated differences in PHQ-9 follow-up scores between treatment failure and non-failure groups, adjusting for baseline disease characteristics, with log of follow-up time as offset. Additionally, a <50% improvement in, or a worsening of, the PHQ-9 score was evaluated as a measure of treatment non-response in logistic regression models.
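The episode definition above can be illustrated with a minimal sketch. It is not the study's code: the tuple layout, the function names `build_episodes` and `has_treatment_failure`, and the choice to measure gaps from the end of the previous days' supply are assumptions made for illustration, and the failure flag treats any drug other than the index drug as a switch or augmentation without distinguishing the two.

```python
from datetime import date, timedelta

MAX_GAP_DAYS = 120  # from the abstract: prescriptions are linked across gaps of <=120 days


def build_episodes(fills, max_gap_days=MAX_GAP_DAYS):
    """Group prescription fills into continuous treatment episodes.

    fills: list of (fill_date, days_supply, drug) tuples (hypothetical layout).
    Assumption: the gap is counted from the end of the previous supply
    to the next fill date.
    """
    episodes = []
    for fill_date, days_supply, drug in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply)
        if episodes and (fill_date - episodes[-1]["end"]).days <= max_gap_days:
            # Gap is short enough: extend the current episode.
            episode = episodes[-1]
            episode["end"] = max(episode["end"], supply_end)
            episode["fills"].append((fill_date, drug))
        else:
            # Gap exceeds the limit (or this is the first fill): start a new episode.
            episodes.append(
                {"start": fill_date, "end": supply_end, "fills": [(fill_date, drug)]}
            )
    return episodes


def has_treatment_failure(episode):
    """Simplified flag: a drug other than the index drug appears in the episode."""
    index_drug = episode["fills"][0][1]
    return any(drug != index_drug for _, drug in episode["fills"])


example_fills = [
    (date(2020, 1, 1), 30, "sertraline"),
    (date(2020, 3, 1), 30, "sertraline"),
    (date(2020, 4, 15), 30, "bupropion"),   # second drug within the same episode
    (date(2021, 6, 1), 30, "sertraline"),   # more than 120 days later: new episode
]
example_episodes = build_episodes(example_fills)
```

In this example the first three fills form one episode that is flagged as a treatment failure, and the last fill starts a second episode without failure.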
RESULTS: Among 2,218 patients (mean age 42.2 years; 24.8% male), 1,741 (78.5%) experienced treatment failure. Treatment failure was associated with significantly higher PHQ-9 follow-up scores (score ratios = 1.27-1.75) across subsets defined by baseline PHQ-9, with moderate effect sizes (Cohen’s d = 0.39-0.55) and Nagelkerke R² (a measure of model fit) between 0.16 and 0.28. Negative binomial regression models showed the highest accuracy (0.61; sensitivity: 0.60, specificity: 0.62) in severely ill patients (baseline PHQ-9 ≥15). Logistic regression models on PHQ-9 score change revealed a maximum area under the curve of 0.76.
CONCLUSIONS: Treatment failure was associated with non-response but did not fully explain variance in follow-up PHQ-9 scores. The findings suggest that, in the absence of data on treatment response, treatment failure may partly reflect non-response but is also explained by other factors. Treatment failure could serve as a severity measure to differentiate MDD subgroups.
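The non-response definition in the Methods (<50% improvement or worsening of the PHQ-9 score) and the accuracy, sensitivity, and specificity reported in the Results can be sketched as follows. This is a simplified cross-tabulation of the proxy against the PHQ-9 outcome, not the regression models used in the study; the function names and the example flags are hypothetical.

```python
def is_non_responder(baseline, follow_up):
    """True when improvement from baseline is <50%, including any worsening."""
    if baseline <= 0:
        raise ValueError("baseline PHQ-9 must be positive to compute relative change")
    improvement = (baseline - follow_up) / baseline
    return improvement < 0.5


def proxy_metrics(failure_flags, non_response_flags):
    """Compare the treatment-failure proxy with PHQ-9 non-response.

    Both arguments are equal-length lists of booleans, one entry per patient.
    """
    pairs = list(zip(failure_flags, non_response_flags))
    tp = sum(1 for f, n in pairs if f and n)          # failure and non-response
    fp = sum(1 for f, n in pairs if f and not n)      # failure but response
    tn = sum(1 for f, n in pairs if not f and not n)  # no failure and response
    fn = sum(1 for f, n in pairs if not f and n)      # no failure but non-response
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical flags for five patients, for illustration only.
example_failure = [True, True, True, False, False]
example_non_response = [True, True, False, False, True]
example_metrics = proxy_metrics(example_failure, example_non_response)
```

A patient whose score falls from 20 to 10 improves by exactly 50% and counts as a responder under this definition, while a fall from 20 to 11 does not reach the threshold.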

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

EPH277

Topic

Clinical Outcomes, Epidemiology & Public Health, Study Approaches

Disease

Mental Health (including addiction), No Additional Disease & Conditions/Specialized Treatment Areas
