Cluster Analysis on RWD to Find Patterns of Multimorbidity: A Conceptual Framework

Speaker(s)

Mulick A¹, Langan SM²
¹Veramed, Twickenham, UK; ²London School of Hygiene & Tropical Medicine, London, UK

Statistical ‘clustering’ techniques are common in outcomes research, especially for identifying treatment pathways and validating patient-reported outcomes. They are now increasingly applied to other problems, such as identifying multimorbidity patterns. One of the first decisions in such an analysis is which data to cluster: individual patient data, where individuals are clustered and disease characteristics are assessed retrospectively within each cluster; or data that have first been summarised across individuals into disease characteristics. The two approaches produce clusters with different interpretations, and the research question behind such analyses is frequently not specified precisely enough to make the choice of clustering technique clear.
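As a rough illustration of the first option (clustering individuals, then assessing disease characteristics retrospectively), the following is a minimal sketch on toy data. The function names, the Jaccard distance between patients, and the average-linkage hierarchical clustering are all assumptions made for illustration; none of them is stated to be the method used in this study. Pairwise distances between individuals also scale quadratically, so this form is only practical for small samples.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def cluster_individuals(X, n_clusters):
    """Cluster rows (patients) of a binary patient-by-condition matrix.

    Assumption: Jaccard distance between patients, average linkage.
    """
    X = np.asarray(X, dtype=bool)
    d = pdist(X, metric="jaccard")           # condensed patient-patient distances
    Z = linkage(d, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def prevalence_by_cluster(X, labels):
    """Retrospective step: prevalence of each condition within each cluster."""
    X = np.asarray(X, dtype=float)
    return {int(k): X[labels == k].mean(axis=0) for k in np.unique(labels)}

# Toy data (hypothetical): 6 patients (rows) x 4 conditions (columns)
patients = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
])

patient_labels = cluster_individuals(patients, n_clusters=2)
prevalence = prevalence_by_cluster(patients, patient_labels)
```

Here the clusters are groups of people, and any statement about diseases comes from describing those groups afterwards.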

OBJECTIVES: In this presentation I will outline a simple framework that we recently used to formulate the research question so that this choice becomes clear. I will illustrate its application to our study, whose goal was to identify groups of health conditions that cluster together in people with eczema and in people with asthma.

METHODS: Using real-world data from adults in the UK, we identified morbidities in 21 broad categories accumulated across the lifespan and investigated multimorbidity patterns in ~1 million people with eczema/asthma compared to ~3 million people without.

RESULTS: Using our framework, we decided to cluster on summarised data, reducing individual patient data (IPD) into ‘pairs’ of disease risk and clustering the resulting Jaccard indices.
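The summarised-data approach can be sketched as follows: compute a Jaccard index for every pair of conditions from a binary patient-by-condition matrix, then cluster the conditions on the resulting indices. This is a minimal sketch under stated assumptions. The toy data, the function names, the conversion to a distance as 1 minus the Jaccard index, and the average-linkage hierarchical clustering are illustrative choices and are not taken from the study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def jaccard_matrix(X):
    """Pairwise Jaccard index between columns (conditions) of a binary matrix.

    J[i, j] = (# people with both i and j) / (# people with i or j).
    """
    X = np.asarray(X, dtype=bool).astype(int)
    both = X.T @ X                            # co-occurrence counts
    counts = np.diag(both)                    # people with each condition
    either = counts[:, None] + counts[None, :] - both
    with np.errstate(divide="ignore", invalid="ignore"):
        J = np.where(either > 0, both / either, 0.0)
    np.fill_diagonal(J, 1.0)
    return J

def cluster_conditions(X, n_clusters):
    """Cluster conditions on 1 - Jaccard (assumed distance), average linkage."""
    D = 1.0 - jaccard_matrix(X)
    Z = linkage(squareform(D, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Toy data (hypothetical): 6 patients (rows) x 4 conditions (columns)
toy = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
])

J = jaccard_matrix(toy)
condition_labels = cluster_conditions(toy, n_clusters=2)
```

In this form the clusters are groups of conditions rather than groups of people, and the individual-level data are needed only to build the pairwise summary.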

CONCLUSIONS: The merits and possible conclusions of clustering individuals together, and retrospectively assessing disease characteristics, will be compared with the merits and possible conclusions of clustering disease risks directly, summarised across individuals. I will illustrate our decision process and how we extended existing methods to make characteristic clustering more interpretable and comparable between cases and controls.

Code

EPH190

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Electronic Medical & Health Records

Disease

Sensory System Disorders (Ear, Eye, Dental, Skin)