Assessing the External Validity of THIN® Database in Spain: A Reliable Source of Real-World Data for Observational Retrospective Research

Speaker(s)

Valente A1, Morros M2, Artés M2, Iglesias C1
1Cegedim Health Data, Barcelona, Catalonia, Spain, 2Adelphi Targis, Barcelona, Catalonia, Spain

OBJECTIVES: Valid real-world data (RWD) sources are essential across Europe for generating real-world evidence. THIN® Spain is a comprehensive RWD source containing longitudinal data on approximately 1.8 million anonymized outpatients from 2014 to the present. This study evaluates the external validity of THIN® Spain as a secondary data source for research.

METHODS: An ecological study was conducted to compare the demographic distribution, prevalence of major chronic diseases, and drug prescriptions between THIN® and official sources: 2020 population data from the National Institute of Statistics (INE), the 2020 European Health Survey in Spain, the 2017 Spanish National Health Survey, and 2021 retail pharmacy dispensing of prescriptions by Anatomical Therapeutic Chemical (ATC) classification. Data from 2017 were included to address biases in prevalence due to COVID-19. Descriptive statistics and Pearson correlations were used to assess concordance between datasets.
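The concordance measures named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `pearson_r` and `mean_abs_difference` are hypothetical, and treating the "average prevalence difference" as a mean absolute difference in percentage points is an assumption, since the abstract does not define it. Each input is a list of paired values, for example the prevalence of each condition in THIN® and in the corresponding official source.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return sxy / sqrt(sxx * syy)


def mean_abs_difference(x, y):
    """Mean absolute difference between paired values.

    Assumption: this is one plausible reading of 'average prevalence
    difference'; the abstract does not state the exact definition.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be paired and non-empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)
```

With two lists in the same category order, `pearson_r(thin_values, official_values)` gives the correlation and `mean_abs_difference(thin_values, official_values)` gives the average gap in the units of the inputs. A high correlation alone does not rule out a systematic offset between sources, which is why a difference measure is useful alongside it.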

RESULTS: THIN® exhibited a strong correlation (r=0.99) with INE's population distribution by age and gender. Prevalence of chronic diseases and drug prescriptions showed correlations of r=0.90 (2017) and r=0.92 (2020). Average prevalence differences were 0.6% (2017) and 1.5% (2020), with depression displaying the highest difference (5.7% in the 2020 survey and 7.17% in the 2017 survey vs. 12.45% in THIN® in 2020). Prevalence was lower for all diseases in the 2020 Health Survey than in the 2017 survey. Drug distribution (ATC classification) had correlations of r=0.997 (ATC1), r=0.992 (ATC2), and r=0.986 (ATC3) with official sources.

CONCLUSIONS: The THIN® database exhibits good external validity for real-world evidence research in Spain, as demonstrated by its alignment with official sources on population distribution by age and gender, prevalence of diseases, and prescribed medications. The difference in prevalence data in 2020 may be attributed to the pandemic, which biased prevalence toward lower figures, and to differences in data collection methodologies (self-reported prevalence in surveys vs. diagnosed prevalence in THIN®).

Code

RWD76

Topic

Real World Data & Information Systems, Study Approaches

Topic Subcategory

Electronic Medical & Health Records, Reproducibility & Replicability

Disease

Cardiovascular Disorders (including MI, Stroke, Circulatory), Diabetes/Endocrine/Metabolic Disorders (including obesity), Mental Health (including addiction), Respiratory-Related Disorders (Allergy, Asthma, Smoking, Other Respiratory)