Clinical Relevance of a Machine Learning Model for Automated Analyses of Depression Severity: The ePHQ-9 in Treatment-Resistant Depression
Author(s)
Pedro Alves, BS1, Carl D. Marci, MD1, Joseph W. Zabinski, PhD, MEM1, Michael Batech, DrPH, MPH2, Costas Boussios, PhD1.
1OM1, Inc., Boston, MA, USA, 2OM1, Inc., Frankfurt, Germany.
OBJECTIVES: The PHQ-9, a validated measure for depressive symptom severity, is inconsistently documented in real-world data (RWD). This limits RWD studies on diagnosis, treatment response, and the patient journey. A previous AI effort successfully estimated PHQ-9 scores (ePHQ-9) from clinical notes with strong analytic performance. This study assessed the association between observed PHQ-9 and ePHQ-9 scores and physician-attested treatment-resistant depression (TRD), a known phenomenon in the literature, to further validate the ePHQ-9’s utility.
METHODS: A large US real-world dataset, including claims and electronic medical records with clinical notes, was used to identify patients with major depression. A subcohort with physician-attested TRD was labeled via text analysis. All patients had an observed PHQ-9 or ePHQ-9 score, or both, within 30 days of TRD attestation. The association between (e)PHQ-9 disease severity (with five categorical divisions) and TRD status was evaluated by quantifying the proportion of TRD patients in each severity category. To facilitate interpretation, subsamples with a fixed ratio (1:4) of TRD to non-TRD patients were evaluated.
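The severity-category analysis described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the five categories are the standard PHQ-9 bands (0-4, 5-9, 10-14, 15-19, 20-27), since the abstract only states that five divisions were used, and it assumes the 1:4 subsample keeps every TRD patient and randomly draws four non-TRD patients per TRD patient. All function names and the `(score, is_trd)` record layout are hypothetical.

```python
import random
from bisect import bisect_right
from collections import Counter

# Assumed: standard PHQ-9 severity bands; the abstract does not list its cutoffs.
CUTOFFS = [5, 10, 15, 20]  # lower bounds of mild, moderate, moderately severe, severe
LABELS = ["none-minimal", "mild", "moderate", "moderately severe", "severe"]


def severity(score):
    """Map a PHQ-9 or ePHQ-9 score (0-27) to one of five severity labels."""
    if not 0 <= score <= 27:
        raise ValueError(f"score out of PHQ-9 range: {score}")
    return LABELS[bisect_right(CUTOFFS, score)]


def fixed_ratio_subsample(patients, ratio=4, seed=0):
    """Keep all TRD patients and draw up to `ratio` non-TRD patients per TRD patient.

    `patients` is a list of (score, is_trd) tuples (hypothetical layout).
    """
    trd = [p for p in patients if p[1]]
    non_trd = [p for p in patients if not p[1]]
    rng = random.Random(seed)
    k = min(len(non_trd), ratio * len(trd))
    return trd + rng.sample(non_trd, k)


def trd_proportion_by_severity(patients):
    """Return {severity label: share of patients in that category who have TRD}."""
    totals, trd_counts = Counter(), Counter()
    for score, is_trd in patients:
        label = severity(score)
        totals[label] += 1
        trd_counts[label] += int(is_trd)
    # Categories with no patients are omitted rather than reported as 0.
    return {label: trd_counts[label] / totals[label]
            for label in LABELS if totals[label]}
```

Calling `trd_proportion_by_severity(fixed_ratio_subsample(patients))` on the observed PHQ-9 cohort and again on the ePHQ-9 cohort would give the per-category proportions for each measure. With a 1:4 ratio the overall TRD share in the subsample is 20%, which gives a common baseline against which each category's proportion can be read.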
RESULTS: The dataset included 77,871 patients (29,608 with observed PHQ-9, 61,794 with ePHQ-9). Physician-attested TRD was present in 1,927 (observed PHQ-9 group) and 1,424 (ePHQ-9 group). TRD proportion increased monotonically with severity for both measures. For observed PHQ-9, the TRD proportion in a fixed-ratio subsample ranged from 8.9% (none-minimal) to 36.5% (severe) (r=0.31). The ePHQ-9 showed a stronger relationship (r=0.42), with the TRD proportion ranging from 4.1% (none-minimal) to 52.7% (severe).
CONCLUSIONS: The AI-estimated PHQ-9 showed a stronger association with physician-attested TRD severity than did observed PHQ-9 scores. This confirms ePHQ-9’s utility as a depression severity measure. Unlike the patient-reported observed PHQ-9, the ePHQ-9 is derived from psychiatrists’ clinical narratives, providing consistency in the assessments of severity and treatment-resistant status.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
RWD35
Topic
Clinical Outcomes, Epidemiology & Public Health, Real World Data & Information Systems
Disease
Mental Health (including addiction), No Additional Disease & Conditions/Specialized Treatment Areas