Automated Non-Interventional Research Protocol Generation: A Case Study in Melanoma

Author(s)

Langham J¹, Benbow E¹, Reason T², Malcolm B³, Gimblett A¹, Hill N⁴
¹Estima Scientific Ltd, London, UK, ²Estima Scientific Ltd, South Ruislip, LON, UK, ³Bristol Myers Squibb, Middlesex, LON, UK, ⁴Bristol Myers Squibb Company, Princeton, NJ, USA

Presentation Documents

ISPOR24_Langham_RWD137_POSTER138572.pdf

OBJECTIVES: To assess whether large language models, such as GPT-4, can be used to automate the drafting of Non-Interventional Research (NIR) study protocols and so make research more efficient to conduct.

METHODS: To automate the development of specific sections of a protocol, a Python API was used to send prompts to GPT-4 and receive its output. Prompts were developed to provide specific inputs for each protocol, such as the population of interest, the aims and objectives, and the data source. For each protocol section, further information about the required structure and content, together with a template or example text for GPT-4 to modify, was also developed and provided. The accuracy and completeness of GPT-4’s outputs were qualitatively assessed against the original human-produced protocol content, focusing on the identification of critical points and noting any omissions or inaccuracies.
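The prompt structure described in the methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (build_section_prompt, draft_section, echo_model), the prompt wording, and the placeholder inputs are all assumptions made for illustration. The model call is passed in as a plain callable so the sketch runs offline; in practice that callable would wrap whichever GPT-4 client is in use.

```python
from typing import Callable, Dict


def build_section_prompt(
    section: str,
    study_inputs: Dict[str, str],
    guidance: str,
    template: str,
) -> str:
    """Assemble one prompt from study inputs, section guidance and a template.

    The layout of the prompt is an assumption; the abstract only states which
    kinds of information were supplied, not how they were worded.
    """
    input_lines = "\n".join(
        f"- {name}: {value}" for name, value in study_inputs.items()
    )
    return (
        f"Draft the '{section}' section of a non-interventional research protocol.\n\n"
        f"Study inputs:\n{input_lines}\n\n"
        f"Structure and content required:\n{guidance}\n\n"
        f"Template or example text to modify:\n{template}\n"
    )


def draft_section(
    section: str,
    study_inputs: Dict[str, str],
    guidance: str,
    template: str,
    send_prompt: Callable[[str], str],
) -> str:
    """Build the prompt for one section and pass it to the supplied model call."""
    prompt = build_section_prompt(section, study_inputs, guidance, template)
    return send_prompt(prompt)


def echo_model(prompt: str) -> str:
    """Offline stand-in for the model call (hypothetical, not a real API)."""
    return f"[draft generated from a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    # Placeholder inputs only; these are not taken from the study protocols.
    inputs = {
        "Population of interest": "<population description>",
        "Aims and objectives": "<aims and objectives>",
        "Data source": "<data source name>",
    }
    print(
        draft_section(
            "Study Design",
            inputs,
            guidance="<required structure and content>",
            template="<example text for the model to modify>",
            send_prompt=echo_model,
        )
    )
```

Keeping the model call behind a callable means the same prompt-building step can be reused for every protocol section and checked without network access.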

RESULTS: Two protocols were autogenerated for retrospective cohort studies whose objectives were to describe patient characteristics, treatment patterns, and clinical outcomes for patients with melanoma. Overall, there was close alignment between the original text and the autogenerated text for the Study Design and Study Population sections. GPT-4 covered general aspects of data collection but lacked specifics about the data sources and their use unless these were specified in the prompt. There was a substantial match in the description of statistical methods, with GPT-4 following the overall guidelines and providing a clear analysis methodology for each objective.

CONCLUSIONS: GPT-4 demonstrates potential for automating the drafting of sections of NIR protocols, with a high degree of alignment with the original human-generated content. No inaccurate text was reported. Where details were missing, the GPT-4 text could be enhanced by incorporating more specific details in the prompts, for example subgroup analyses, how patients are selected from a data source, and the definition of the index date.
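One way to act on this conclusion is to check, before a prompt is sent, whether the study-specific details named above have been supplied. The sketch below is illustrative only: the key names in RECOMMENDED_DETAILS and the function missing_details are hypothetical and were not part of the reported work.

```python
from typing import Dict, List

# Details the conclusions suggest adding to prompts; the key names are assumptions.
RECOMMENDED_DETAILS = (
    "subgroup analyses",
    "patient selection from the data source",
    "index date definition",
)


def missing_details(study_inputs: Dict[str, str]) -> List[str]:
    """Return the recommended details that are absent or blank in the inputs."""
    return [
        key
        for key in RECOMMENDED_DETAILS
        if not str(study_inputs.get(key, "")).strip()
    ]


if __name__ == "__main__":
    # Placeholder value only; not taken from the study protocols.
    inputs = {"index date definition": "<definition of the index date>"}
    for key in missing_details(inputs):
        print(f"Consider adding to the prompt: {key}")
```

A check like this only flags gaps in what the prompt contains; judging whether the generated text is accurate and complete would still require review against the protocol requirements.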

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

RWD137

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Prospective Observational Studies

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Oncology
