Leveraging Machine Learning To Understand Pet Owner Experiences of Feline Pruritus Through Social Media Listening

Author(s)

Cherry G¹, Mpantis A², Rai T³, Wright A⁴, Brown R², Wells K⁵
¹University of Surrey, Guildford, SRY, UK, ²Athens Technology Center (ATC), Athens, Greece, ³University of Surrey, London, LON, UK, ⁴Zoetis, Babcock Ranch, FL, USA, ⁵University of Surrey, Guildford, Surrey, UK

Presentation Documents

ISPOR24_Cherry_MSR5_POSTER134564.pdf

OBJECTIVES: Feline pruritus is a common disease in the domestic cat with easily observable symptoms, yet its impact on pet owners' lives and on the quality of life of affected animals remains poorly understood. Traditional research methods, including surveys and interviews, are resource-intensive and require access to representative cohorts, which limits their feasibility. This study used Social Media Listening (SML) to collect relevant conversations from social media sites. These data are freely available, in the public domain, and free of question bias.

METHODS: Keywords, content sources and topics were selected by clinical veterinary dermatology experts and augmented with the research literature. Data were collected using ATC’s social intelligence platform. Extracting high-quality data using SML required well-defined relevance criteria, with posts manually labelled as relevant or irrelevant. A dataset comprising 5,000 labelled real-world posts and 3,800 synthetic posts (to mitigate data scarcity) was split into 7,000 training and 1,800 test posts for machine learning. Synthetic data were generated by language models such as OpenAI GPT from labelled data considered unsuitable for training owing to post length. Data cleaning was applied to Twitter and Reddit posts (real and synthetic) to remove posts of fewer than seven words, followed by lemmatization and spelling correction. Posts of more than 500 words were summarised using a GPT language model. Entities (synonyms and similar words) were extracted using cosine similarity.
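
The length-based cleaning rules and the cosine similarity measure described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `triage_posts` and `cosine_similarity` are hypothetical, whitespace tokenisation is an assumption, and the abstract does not say which vector representation was used for entity extraction, so plain numeric lists stand in for word embeddings.

```python
import math

MIN_WORDS = 7    # posts with fewer words are removed (threshold from the abstract)
MAX_WORDS = 500  # posts longer than this are sent for summarisation


def triage_posts(posts):
    """Split posts into those kept as-is and those needing summarisation.

    Word counts use whitespace splitting, which is an assumption; the
    abstract does not specify a tokeniser.
    """
    keep, to_summarise = [], []
    for text in posts:
        n_words = len(text.split())
        if n_words < MIN_WORDS:
            continue  # too short: dropped from the dataset
        if n_words > MAX_WORDS:
            to_summarise.append(text)
        else:
            keep.append(text)
    return keep, to_summarise


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector has zero length, to avoid dividing by zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

In a setting like this one, candidate synonyms would be the terms whose vectors score closest to 1.0 against a seed term; the cut-off for "similar" is a tuning choice that the abstract does not report.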

RESULTS: A fine-tuned variant of the BERT uncased model, trained for relevance detection on case-specific data over ten epochs, yielded an F1 score of 0.8026, a sensitivity of 0.8838, and a precision of 0.7350.
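
As a consistency check on the reported metrics, F1 is the harmonic mean of precision and sensitivity (recall). The short sketch below recomputes it from the two reported values; the helper name `f1_score` is illustrative and is not taken from the study's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
f1 = f1_score(precision=0.7350, recall=0.8838)
print(round(f1, 4))
```

The recomputed value agrees with the reported F1 of 0.8026 to four decimal places.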

CONCLUSIONS: This study highlights the benefits of employing machine learning in relevance detection: it reduces human error and marker fatigue while making data analysis faster and more scalable. This innovative approach, built on manually labelled posts, offers promising insights into feline pruritus and its ramifications for feline and pet owner well-being, and could potentially help validate health-related quality of life measures.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

MSR5

Topic

Epidemiology & Public Health, Methodological & Statistical Research, Patient-Centered Research, Real World Data & Information Systems

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Data Protection, Integrity, & Quality Assurance, Patient-reported Outcomes & Quality of Life Outcomes

Disease

Sensory System Disorders (Ear, Eye, Dental, Skin), Skin (including hair loss) Diseases/Disorders, Veterinary Medicine
