ZERO-SHOT LUNG CANCER RISK PREDICTION FROM LONGITUDINAL ELECTRONIC HEALTH RECORDS WITH CHAIN-OF-AGENTS FRAMEWORK

Author(s)

Sihang Zeng, BS¹, Youngwon Kim, PhD¹, Wilson Lau, PhD¹, Ehsan Alipour, MD, PhD¹, Ruth Etzioni, PhD², Meliha Yetisgen, PhD³, Anand Oka, PhD¹, Jay Nanduri, MBA¹.
¹Truveta, Bellevue, WA, USA, ²Fred Hutch Cancer Center, Seattle, WA, USA, ³University of Washington, Seattle, WA, USA.

Presentation Documents

ISPOR 2026 Poster CoA.pdf

OBJECTIVES: Early identification of individuals at higher risk for lung cancer can improve outcomes and help target screening resources. We evaluate whether a large language model (LLM)-based chain-of-agents (CoA) framework can estimate 1-year lung cancer risk directly from raw longitudinal electronic health record (EHR) data, reducing the need for the data cleaning, feature engineering, and task-specific model training required by traditional machine learning (ML) models.
METHODS: Using Truveta Data (de-identified EHR for 120 million patients from leading US health systems), we identified lung cancer cases with clinician-curated diagnostic codes and randomly sampled a test cohort of 500 cases and 125,000 controls. For each patient, only EHR history recorded more than one year before the diagnosis (or index) date was used. The CoA framework applied sequential LLM agents to summarize key clinical events from chronological EHR segments, then aggregated the summaries into a consolidated risk profile used to predict a 1-year lung cancer risk score from 1 to 10. We compared CoA performance with common ML models such as XGBoost and with a single-agent LLM baseline.
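The sequential-agent flow described in METHODS can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: the names `chunk_events`, `chain_of_agents`, `stub_worker`, `stub_manager`, and `RISK_TERMS` are hypothetical, the segment size is arbitrary, and the two stubs are deterministic keyword rules standing in for real LLM calls (the abstract does not describe prompts or model interfaces).

```python
from typing import Callable, List

def chunk_events(events: List[str], size: int) -> List[List[str]]:
    """Split a chronologically ordered event list into consecutive segments."""
    return [events[i:i + size] for i in range(0, len(events), size)]

def chain_of_agents(
    events: List[str],
    worker: Callable[[str, List[str]], str],
    manager: Callable[[str], int],
    segment_size: int = 3,  # arbitrary choice for this sketch
) -> int:
    """Pass a running summary through one worker per segment, then score it."""
    summary = ""
    for segment in chunk_events(events, segment_size):
        # Each worker sees the summary so far plus the next EHR segment.
        summary = worker(summary, segment)
    # The final agent maps the consolidated profile to a score, clamped to 1..10.
    return max(1, min(10, manager(summary)))

# Hypothetical stand-ins for LLM agents (assumption: simple keyword rules).
RISK_TERMS = ("smoker", "copd", "cough")

def stub_worker(summary: str, segment: List[str]) -> str:
    kept = [e for e in segment if any(t in e.lower() for t in RISK_TERMS)]
    return "; ".join(([summary] if summary else []) + kept)

def stub_manager(summary: str) -> int:
    findings = [s for s in summary.split("; ") if s]
    return 1 + 2 * len(findings)

# Made-up example history, oldest event first.
events = [
    "2015: office visit",
    "2016: current smoker, 30 pack-years",
    "2018: COPD diagnosis",
    "2019: flu vaccine",
    "2021: chronic cough",
]
score = chain_of_agents(events, stub_worker, stub_manager)
```

In an actual deployment the worker and manager would be LLM calls with task prompts; the point of the sketch is only the control flow, in which no single agent needs to read the entire record.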
RESULTS: CoA based on GPT-4.1-mini achieved strong discrimination (AUROC 0.871; 95% CI: 0.855-0.885). Using a threshold chosen to balance sensitivity and specificity, CoA achieved NPV 0.999, sensitivity 0.772, specificity 0.825, and PPV 0.017 in this low-incidence cohort. Performance was comparable to, or slightly lower than, that of trained ML models, but was obtained without feature engineering or model training. Further evaluation showed that CoA produced more complete and temporally coherent clinical reasoning than the single-agent LLM, and that this reasoning aligned well with clinical knowledge.
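As a rough consistency check, the reported PPV and NPV can be reproduced from the reported sensitivity, specificity, and cohort composition. The sketch below assumes the threshold metrics were computed on the full test cohort of 500 cases and 125,000 controls, which the abstract implies but does not state outright; the function name `confusion_counts` is illustrative.

```python
def confusion_counts(n_cases, n_controls, sensitivity, specificity):
    """Expected confusion-matrix counts implied by sensitivity and specificity."""
    tp = sensitivity * n_cases      # cases flagged as high risk
    fn = n_cases - tp               # cases missed
    tn = specificity * n_controls   # controls correctly not flagged
    fp = n_controls - tn            # controls flagged as high risk
    return tp, fp, tn, fn

# Cohort sizes and metrics as reported in METHODS and RESULTS.
tp, fp, tn, fn = confusion_counts(500, 125_000, 0.772, 0.825)

ppv = tp / (tp + fp)  # share of flagged patients who are cases
npv = tn / (tn + fn)  # share of unflagged patients who are controls

print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")  # PPV = 0.017, NPV = 0.999
```

Under that assumption, about 386 true positives sit among roughly 21,875 false positives, which is why PPV stays low despite good discrimination: cases make up only about 0.4% of the cohort.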
CONCLUSIONS: The zero-shot LLM-based CoA framework can predict lung cancer risk directly from heterogeneous real-world EHR data, with clinically meaningful reasoning and performance comparable to ML models, while eliminating costly data pre-processing and training. This approach may lower implementation barriers and support scalable deployment of early detection tools to improve lung cancer outcomes.

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

MSR67

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

SDC: Oncology

Presentation (CTI)