SOCIAL DETERMINANTS OF HEALTH, MORTALITY, AND EXTENDED HOSPITAL STAY IN ATRIAL FIBRILLATION: LEVERAGING OPEN-SOURCE LARGE LANGUAGE MODELS

Author(s)

Vishnu Bharadwaj Suresh, MSc¹, Won Lee, PhD².
¹Axtria HEOR/RWE, Berkeley Heights, NJ, USA; ²Axtria HEOR/RWE, San Francisco, CA, USA.
OBJECTIVES: To evaluate the feasibility of using small-scale open-source large language models (LLMs) to extract social determinants of health (SDoH) from clinical discharge summaries and assess whether these features improve prediction of 30-day mortality and extended length of stay (LOS) among patients with atrial fibrillation.
METHODS: We analyzed 1,184 atrial fibrillation admissions from the MIMIC-IV database (2008-2022). Only open-source LLMs with <7 billion parameters were considered; the best-performing model (Qwen3:1.7B) was selected based on extraction quality and stability. Thirteen SDoH attributes spanning employment status, social support, and relationship status were identified from 1,000 discharge summaries. Predictive models (Lasso logistic regression, Random Forest, and XGBoost) were trained on clinical features alone and on clinical features combined with SDoH features to predict 30-day mortality and extended LOS (>7 days).
RESULTS: LLM-based extraction achieved a 32.9% detection rate under high-confidence validation. For extended LOS (37.1% prevalence), adding SDoH modestly improved performance: Random Forest AUC increased from 0.81 to 0.82 (+0.6%), and Lasso AUC from 0.77 to 0.78 (+1.3%). Retired employment status was the strongest SDoH predictor, followed by relationship status. For 30-day mortality (9.1% prevalence), clinical-only models outperformed SDoH-augmented models (best AUC: 0.83 vs. 0.79, XGBoost), although limited social support showed a moderate association with mortality (Lasso coefficient = 0.45).
CONCLUSIONS: Small open-source LLMs can reliably extract meaningful SDoH from clinical notes and provide incremental value for predicting extended LOS without reliance on large proprietary models. Observed associations between SDoH and outcomes highlight opportunities for tailored discharge planning and care coordination.
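The model comparison in RESULTS rests on differences in AUC between clinical-only and SDoH-augmented models. The following is a minimal sketch, not the authors' code, of how an AUC and its relative change can be computed with the Python standard library alone. The function names `roc_auc` and `relative_change_pct` and all toy labels and scores are hypothetical, invented purely for illustration; none of the data comes from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5). Equivalent to the Mann-Whitney U statistic
    divided by the number of positive-negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_change_pct(baseline: float, augmented: float) -> float:
    """Relative change of `augmented` over `baseline`, in percent."""
    return 100.0 * (augmented - baseline) / baseline


# Toy data, invented for illustration only (NOT from the study).
# 1 = outcome occurred (e.g. extended LOS), 0 = it did not.
y_true = [0, 0, 1, 1, 0, 1]
clinical_only = [0.10, 0.40, 0.35, 0.80, 0.50, 0.70]       # hypothetical model scores
clinical_plus_sdoh = [0.10, 0.30, 0.45, 0.80, 0.40, 0.70]  # hypothetical model scores

auc_base = roc_auc(y_true, clinical_only)
auc_aug = roc_auc(y_true, clinical_plus_sdoh)
print(f"clinical-only AUC:   {auc_base:.3f}")
print(f"clinical + SDoH AUC: {auc_aug:.3f}")
print(f"relative change:     {relative_change_pct(auc_base, auc_aug):+.1f}%")
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based implementation would be preferable at cohort scale. The percentages in RESULTS may have been computed from unrounded AUCs, so recomputing them from the rounded values shown need not match exactly.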

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

RWD144

Topic

Real World Data & Information Systems

Topic Subcategory

Health & Insurance Records Systems

Disease

SDC: Cardiovascular Disorders (including MI, Stroke, Circulatory)
