Comparison of Generative AI and Manual Data Programming in a Lupus Health Productivity Loss Study
Author(s)
Tiange Tang, MPH1; Catherine Mak, MSc1; Feng Zeng2
1Biogen, Cambridge, MA, USA, 2Biogen, Value Evidence Strategy Lead, Cambridge, MA, USA
OBJECTIVES: Generative artificial intelligence (AI) is an emerging tool for data programming in real-world evidence research. This study aimed to replicate a human-led evaluation of health productivity losses due to systemic lupus erythematosus in a U.S. commercially insured population using AI-generated code.
METHODS: Data from January 1, 2016, to December 31, 2022, were extracted from the IBM® MarketScan® Commercial & Medicare Claims and Health and Productivity Management (HPM) databases. The AI replication process included four steps: (1) researchers completed all tasks using SQL and R, including coding and visualization of results; (2) the human-written code was divided into tasks, with corresponding prompts created for ChatGPT-4; (3) after the prompts were input to ChatGPT-4, the ChatGPT-generated code was tested against the original human results; (4) if ChatGPT-4 could not generate correct code for a task after 10 prompt attempts, human intervention was introduced to complete it. The outcomes measured were code generation success, replication accuracy, efficiency (number of commands used), and number of revisions.
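As a minimal sketch of the equivalence test in step (3), a check in R might look like the following (illustrative only; the data frames and the enrolid and days_absent columns are hypothetical stand-ins for the study outputs, not the study's actual code):

# Toy stand-ins for the human-written reference output and the
# ChatGPT-4-generated output being compared.
human_result <- data.frame(enrolid = c(1, 2, 3), days_absent = c(4, 0, 12))
ai_result    <- data.frame(enrolid = c(3, 1, 2), days_absent = c(12, 4, 0))

# Align row order on patient ID before comparing values.
human_sorted <- human_result[order(human_result$enrolid), ]
ai_sorted    <- ai_result[order(ai_result$enrolid), ]

# all.equal() returns TRUE or a character vector of differences;
# check.attributes = FALSE ignores the differing row names.
replicated <- isTRUE(all.equal(human_sorted, ai_sorted, check.attributes = FALSE))

if (replicated) {
  message("Replication accurate: AI output matches the human reference.")
} else {
  message("Mismatch: revise the prompt (up to 10 attempts) and retest.")
}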
RESULTS: Seventy-five tasks were generated, and ChatGPT-4 created code for each. Of these, 77.3% were completed without revision, while 18.7% required fewer than 10 prompt revisions to achieve accurate results. The remaining 4% of tasks, such as calculating Charlson Comorbidity Index scores using International Classification of Diseases (ICD)-9/10 coding, needed human intervention.
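For context, a heavily abbreviated base-R sketch of the kind of task that needed human intervention follows; only a few Charlson categories and weights are shown, the claims data are hypothetical, and a full implementation requires the complete ICD-9/ICD-10 category mapping (e.g., Quan et al.):

# Hypothetical claims: one row per patient-diagnosis (ICD-10, no decimals).
claims <- data.frame(
  enrolid = c(1, 1, 2, 3),
  dx      = c("I21", "E112", "I509", "C50")
)

# Abbreviated prefix-to-category map with standard Charlson weights.
cci_map <- data.frame(
  prefix   = c("I21", "I50", "E11", "C50"),
  category = c("mi", "chf", "diabetes", "cancer"),
  weight   = c(1, 1, 1, 2)
)

matched <- merge(claims, cci_map, by = NULL)                  # cross join
matched <- matched[startsWith(matched$dx, matched$prefix), ]  # prefix match
# Count each Charlson category at most once per patient before summing.
matched <- unique(matched[, c("enrolid", "category", "weight")])
cci <- aggregate(weight ~ enrolid, data = matched, FUN = sum)
names(cci)[2] <- "cci_score"  # patients with no mapped diagnosis (score 0) are absent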
CONCLUSIONS: ChatGPT-4 can replicate simple data tasks, such as patient selection, within an acceptable number of prompt iterations. However, human intervention currently remains necessary for more complex coding tasks.
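A minimal sketch of a "simple" patient-selection task of the sort ChatGPT-4 handled well, on toy data with assumed MarketScan-style column names (a real algorithm would typically exclude drug-induced lupus, M32.0, and require more than one qualifying claim):

claims <- data.frame(
  enrolid = c(1, 1, 2, 3),
  svcdate = as.Date(c("2016-03-01", "2019-07-15", "2015-11-20", "2021-02-09")),
  dx      = c("M329", "M3210", "M329", "J45")
)

# Keep enrollees with at least one SLE claim (ICD-10 M32.x) in the window.
in_window <- claims$svcdate >= as.Date("2016-01-01") &
  claims$svcdate <= as.Date("2022-12-31")
sle_patients <- unique(claims$enrolid[startsWith(claims$dx, "M32") & in_window])
sle_patients  # patient 2 falls outside the window; patient 3 has no SLE code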
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, Canada
Value in Health, Volume 28, Issue S1
Code
MSR75
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
SDC: Systemic Disorders/Conditions (Anesthesia, Auto-Immune Disorders (n.e.c.), Hematological Disorders (non-oncologic), Pain)