Evaluating Natural Language Processing Customization Techniques for Healthcare-Related Application Development
Author(s)
ABSTRACT WITHDRAWN
OBJECTIVES: In natural language processing (NLP), models are classified by size into Small Language Models (SLMs) and Large Language Models (LLMs). Both are generative artificial intelligence systems used in similar ways. When developing healthcare applications, factors such as model size, efficiency, costs, and specific use cases are crucial for selecting the most appropriate model. Our goal was to develop a healthcare web application, using LLMs and make it available for external testing. We tailored the application for primary and secondary healthcare prevention, with a focus on healthy lifestyle and cervical cancer screenings, and collected feedback via a user experince questionaire. We evaluated whether a customized LLMs or SLMs would be more suitable for this purpose.
METHODS: We reviewed scientific literature on SLMs and experimented with various LLM customization methods using the GPT-4 model. We explored OpenAI's custom-GPT builder and prepared a training dataset to fine-tune the GPT-4 model. Prompt engineering techniques were also tested for customization. Information on cervical cancer was referenced from WHO guidelines. Python and Streamlit were used to make the application testable.
RESULTS: Customizing an LLM using its application programming interface, such as GPT-4, is considerably easier than developing an SLM from the beginning. OpenAI's custom-GPT building process is user-friendly and requires no programming skills but offers limited customization options. Fine-tuning the GPT-4 model is complex and heavily dependent on the quality of the training dataset, leading to higher rates of hallucinations and errors. Prompt engineering proved to be the most effective method, providing the greatest flexibility and consistency.
CONCLUSIONS: The options for creating customized language models are rapidly expanding, sometimes eliminating the need for programming skills. SLMs are more cost-effective and can function offline, but LLMs offer greater customization, exhibit more empathetic responses, and possess more extensive informational capabilities.
Conference/Value in Health Info
Code
HSD26
Topic
Epidemiology & Public Health
Topic Subcategory
Public Health
Disease
Medical Devices, Oncology