Natural Language Processing for Identification of Negative Symptoms in Patients With and Without Cognitive Impairment Associated With Schizophrenia (CIAS)
Author(s)
Theresa Cassidy, MPH1, Ashley Wu, MHS2, Benjamin Fell, PhD3, Monika Frysz, PhD4, Georgina Ireland, PhD3, Suzanne St Rose, PhD2, Ceyda Uysal, MSc3, Rashmi Patel, MD, PhD5.
1Boehringer Ingelheim, Ridgefield, CT, USA, 2Boehringer Ingelheim International GmbH, Ingelheim am Rhein, Germany, 3Akrivia Health, Department of Research, Oxford, United Kingdom, 4Boehringer Ingelheim, Ltd., Real World Evidence CoE, Bracknell, United Kingdom, 5Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom.
OBJECTIVES: Cognitive impairment (CI) and negative symptoms (NS) are key features of schizophrenia. This study used natural language processing (NLP) to identify patients with CIAS and NS and compare demographic and clinical characteristics by CIAS/NS status.
METHODS: Data from UK patients aged ≥18 years with a first schizophrenia diagnosis and a mention of CI (between 1-Jan-2005 and 31-Dec-2023) in the Akrivia Health secondary mental healthcare dataset were evaluated. Patients with dementia, mild CI, intellectual disability, or a first CI mention >3 months post-diagnosis were excluded. CIAS status had previously been determined from unstructured electronic healthcare records (EHRs) by NLP; here, NLP with 35 clinician-identified keywords indicating NS was used to determine patient NS status. The NS domain of the Positive and Negative Syndrome Scale was used as a reference framework. NS keyword prevalence is reported descriptively. Associations between patient characteristics and CIAS/NS status were analysed by multinomial logistic regression.
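As an illustration only, keyword-based NS flagging of free-text notes could be sketched as below. The abstract does not describe the NLP pipeline, so everything here is an assumption: the regular expressions, the function names `ns_keywords_in_note` and `patient_ns_status`, and the rule that a single keyword hit marks a patient as NS-Y are all hypothetical. Only two of the 35 keywords are shown, and negation or context handling (for example "not withdrawn") is deliberately left out.

```python
import re

# Hypothetical surface patterns for two NS keywords named in the abstract.
# The full 35-keyword list and the real matching rules are not given in the source.
NS_PATTERNS = {
    "withdrawn behaviour": re.compile(r"\bwithdrawn\b", re.IGNORECASE),
    "flat affect": re.compile(r"\bflat(tened)?\s+affect\b", re.IGNORECASE),
}


def ns_keywords_in_note(note):
    """Return the set of NS keyword labels whose pattern occurs in one note."""
    return {label for label, pattern in NS_PATTERNS.items() if pattern.search(note)}


def patient_ns_status(notes):
    """Classify a patient as NS-Y if any note contains any NS keyword (assumed rule)."""
    found = set()
    for note in notes:
        found |= ns_keywords_in_note(note)
    status = "NS-Y" if found else "NS-N"
    return status, found


if __name__ == "__main__":
    example_notes = [
        "Patient appears withdrawn on the ward.",
        "Flattened affect noted at review.",
    ]
    print(patient_ns_status(example_notes))
```

A production pipeline would also need to handle negation, family history, and hypothetical statements, none of which this sketch attempts.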
RESULTS: Of 35,710 patients identified with schizophrenia, 23,217 (65.0%) were classified as having NS; of these, 18,874 (81.3%) had CIAS. The overall cohort was stratified by CIAS/NS status (Y, yes; N, no): NS-N/CIAS-N, n=9,957; NS-N/CIAS-Y, n=2,536; NS-Y/CIAS-N, n=4,343; and NS-Y/CIAS-Y, n=18,874. Common NS keywords were more prevalent in CIAS-Y than CIAS-N patients, at 64.9%/15.0% (withdrawn behaviour), 35.9%/3.3% (social interaction finding) and 35.6%/3.2% (flat affect), respectively. NS-Y/CIAS-Y patients were more likely to be male, had the lowest mean (SD) age (41.3 [14.5] years), highest median (IQR) medication count (5 [2-11]) (all p<0.001), and a higher frequency of prescriptions for second-generation antipsychotics at 78.7% vs 53.4% and 55.6% for the NS-N/CIAS-Y and NS-Y/CIAS-N groups, respectively.
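The stratum counts can be cross-checked against the headline percentages with a few lines of arithmetic. The sketch below uses only the four group sizes reported in the abstract; the variable names are illustrative.

```python
# Reported stratum sizes, keyed by (NS status, CIAS status).
counts = {
    ("N", "N"): 9957,
    ("N", "Y"): 2536,
    ("Y", "N"): 4343,
    ("Y", "Y"): 18874,
}

total = sum(counts.values())                      # whole schizophrenia cohort
ns_yes = counts[("Y", "N")] + counts[("Y", "Y")]  # all NS-Y patients

# Share of the cohort with NS, and share of NS-Y patients who also have CIAS.
pct_ns = round(100 * ns_yes / total, 1)
pct_cias_among_ns = round(100 * counts[("Y", "Y")] / ns_yes, 1)

print(total, ns_yes, pct_ns, pct_cias_among_ns)
```

The four strata sum to 35,710 and the two NS-Y strata sum to 23,217, which reproduces the reported 65.0% and 81.3%.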
CONCLUSIONS: NLP of patient EHRs enabled NS identification and showed that NS were more common among CIAS-Y patients and that CIAS was more common among NS-Y patients. Applying NLP to EHRs may allow earlier NS identification in CIAS-Y patients and help ensure that the unmet needs of this patient population are recognised.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
SA71
Topic
Study Approaches
Disease
Mental Health (including addiction)