Validation Between Antidepressant Treatment Failure Proxy and PHQ-9 Score in Major Depressive Disorder Using a Linked Insurance Claims and Electronic Health Record Dataset
Author(s)
Rashmi Patel, MD PhD1, Carissa White Dukes, MPH2, Suzanne St Rose, PhD3, Anne Kilburg, MSc3, Sigurd D. Suessmuth, MD3, Franco De Crescenzo, MD DPhil3, Ling Zhang, MPH4.
1University of Cambridge, Cambridge, United Kingdom, 2Boehringer Ingelheim Pharmaceuticals, Inc., Greenwood, IN, USA, 3Boehringer Ingelheim International GmbH, Ingelheim am Rhein, Germany, 4Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, CT, USA.
OBJECTIVES: Evaluating antidepressant response for major depressive disorder (MDD) in real-world data is challenging due to limited availability of structured symptom measures. This study aimed to assess whether antidepressant treatment failure can serve as a proxy for treatment non-response using Patient Health Questionnaire-9 (PHQ-9) data.
METHODS: Adults newly diagnosed with MDD and ≥1 claim for antidepressants between 01/01/2012 and 06/30/2022, with ≥2 PHQ-9 scores recorded, were identified from the Optum Market Clarity database. Treatment failure was defined as a switch or augmentation during a continuous treatment episode (linking prescriptions allowing ≤120-day gaps). PHQ-9 scores were obtained within 30 days of treatment episode initiation and end. Negative binomial regression estimated differences in PHQ-9 follow-up scores between treatment failure and non-failure groups, adjusting for baseline disease characteristics, with log of follow-up time as offset. Additionally, a <50% improvement or worsening of PHQ-9 score was evaluated as a measure of treatment non-response in logistic regression models.
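The episode-linking rule and the non-response definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `link_episodes` and `is_non_response` are hypothetical, and the abstract does not say how a gap is measured, so the sketch assumes it runs from the end of one fill's days of supply to the start of the next fill.

```python
from datetime import date, timedelta

MAX_GAP_DAYS = 120  # from the abstract: gaps of up to 120 days are allowed


def link_episodes(fills, max_gap_days=MAX_GAP_DAYS):
    """Group (start_date, days_supply) fills into continuous treatment episodes.

    Assumption (not stated in the abstract): a gap is counted from the end of
    one fill's supply to the start of the next fill.
    Returns a list of (episode_start, episode_end) tuples.
    """
    episodes = []
    for start, days_supply in sorted(fills):
        end = start + timedelta(days=days_supply)
        if episodes and (start - episodes[-1][1]).days <= max_gap_days:
            # Gap is small enough: extend the current episode.
            ep_start, ep_end = episodes[-1]
            episodes[-1] = (ep_start, max(ep_end, end))
        else:
            # First fill, or gap too long: start a new episode.
            episodes.append((start, end))
    return episodes


def is_non_response(baseline_phq9, follow_up_phq9):
    """Flag non-response: <50% improvement from baseline, or worsening."""
    if baseline_phq9 <= 0:
        raise ValueError("baseline PHQ-9 must be positive to compute relative change")
    improvement = (baseline_phq9 - follow_up_phq9) / baseline_phq9
    # Worsening gives a negative improvement, so it is also below 0.5.
    return improvement < 0.5


# Hypothetical fills, for illustration only (not study data).
example_fills = [(date(2020, 1, 1), 30), (date(2020, 3, 1), 30), (date(2020, 9, 1), 30)]
example_episodes = link_episodes(example_fills)
```

With these made-up fills, the first two prescriptions fall into one episode and the third, arriving more than 120 days after the previous supply ends, starts a new one.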
RESULTS: Among 2,218 patients (mean age 42.2 years; 24.8% male), 1,741 (78.5%) experienced treatment failure. Treatment failure was associated with significantly higher PHQ-9 follow-up scores (score ratios = 1.27–1.75) across subsets defined by baseline PHQ-9, with moderate effect sizes (Cohen’s d = 0.39–0.55) and Nagelkerke R² (a measure of model fit) between 0.16 and 0.28. Negative binomial regression models showed the highest accuracy (0.61; sensitivity: 0.60, specificity: 0.62) in severely ill patients (baseline PHQ-9 ≥15). Logistic regression models on PHQ-9 score change reached a maximum area under the curve of 0.76.
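For readers less familiar with the classification metrics reported above, the following minimal sketch shows how accuracy, sensitivity, and specificity are derived from a 2×2 confusion table. The helper name `classification_metrics` is hypothetical, and the counts are arbitrary illustrative values, not the study's data; the abstract does not report the underlying counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-table counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # share of true positives detected
        "specificity": tn / (tn + fp),   # share of true negatives detected
    }


# Arbitrary illustrative counts (assumption; NOT taken from the study).
metrics = classification_metrics(tp=30, fp=10, tn=40, fn=20)
```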
CONCLUSIONS: Treatment failure was associated with non-response but did not fully explain variance in follow-up PHQ-9 scores. The findings suggest that, in the absence of data on treatment response, treatment failure may be partly correlated with non-response but is also explained by other factors. Treatment failure could serve as a severity measure to differentiate MDD subgroups.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
EPH277
Topic
Clinical Outcomes, Epidemiology & Public Health, Study Approaches
Disease
Mental Health (including addiction), No Additional Disease & Conditions/Specialized Treatment Areas