USE OF A COMMON DATA MODEL TO FACILITATE RAPID ANALYTICS SUPPORTING HEALTH OUTCOMES RESEARCH

Author(s)

Kim H¹, Joo S¹, Anstatt D¹, Morrison J¹, Germscheid L², Murray R²
¹Bristol-Myers Squibb Company, Pennington, NJ, USA, ²UBC, Harrisburg, PA, USA

OBJECTIVES: Research using observational data is complex and costly because the data sets are massive and their formats disparate. Consequently, scientists frequently rely on a single data source when conducting a study, which can yield an inaccurate reflection of the target population or an insufficient sample size. A Common Data Model (CDM) standardizes the format across disparate observational databases, facilitating consistent and efficient application of research methods. The objective of this study was to rapidly and efficiently estimate how well disparate sources reflect “known” population characteristics, using standardized patient selection, analysis, and visualization software on data that had been transformed into a CDM.

METHODS: Using CEWorks® software, atrial fibrillation (AF) patients treated with warfarin or a novel oral anticoagulant (NOAC) were selected from multiple administrative claims and electronic health record (EHR) databases previously transformed into a CDM format. Rates of selected disease states were calculated using a CEWorks analysis module and then compared with results published in a recent study. Results across the disparate databases were imported into a visualization tool for further comparison among data sources.

RESULTS: Preliminary results indicate that rates of the selected disease states across disparate databases are similar to those published in the previous study. The numbers and percentages of patients reported in the study, NOAC [10,789 - 35%] and warfarin [19,964 - 65%], were comparable to the 8,093 [29%] NOAC patients and 20,133 [65%] warfarin patients in the CDM version. The different data sources also show similar prevalence rates for the selected disease states, although these similarities show some gaps introduced by region and other demographic variables.

CONCLUSIONS: Use of a CDM enables rapid data analysis across multiple data sources and meaningful comparisons across disparate data. Results are easily linked to data visualization tools for further analysis. This study has far-reaching implications for data scientists using multiple large data sources, and further study is needed to verify its practicality.
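
The central idea of the abstract, that converting disparate sources into one common format lets a single analysis run unchanged on all of them, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration: the record layouts, the field names (`member_id`, `dx`, `patient`, `problem_list`), the `CdmPerson` type, and the toy data. It does not represent the CEWorks software, its analysis module, or the CDM used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdmPerson:
    """Hypothetical common-format record: one entry per patient."""
    person_id: str
    source: str
    conditions: frozenset


def from_claims(rows):
    """Map hypothetical claims rows (one row per diagnosis) to CdmPerson."""
    by_member = {}
    for row in rows:
        by_member.setdefault(row["member_id"], set()).add(row["dx"])
    return [
        CdmPerson(pid, "claims", frozenset(dx))
        for pid, dx in sorted(by_member.items())
    ]


def from_ehr(rows):
    """Map hypothetical EHR rows (one row per patient) to CdmPerson."""
    return [
        CdmPerson(r["patient"], "ehr", frozenset(r["problem_list"]))
        for r in rows
    ]


def prevalence(people, condition):
    """Share of patients with the condition; same logic for every source."""
    if not people:
        return 0.0
    return sum(condition in p.conditions for p in people) / len(people)


# Toy data only; these values are invented and are not study results.
claims_rows = [
    {"member_id": "c1", "dx": "hypertension"},
    {"member_id": "c1", "dx": "diabetes"},
    {"member_id": "c2", "dx": "hypertension"},
    {"member_id": "c3", "dx": "heart_failure"},
    {"member_id": "c4", "dx": "diabetes"},
]
ehr_rows = [
    {"patient": "e1", "problem_list": ["hypertension"]},
    {"patient": "e2", "problem_list": ["diabetes", "hypertension"]},
    {"patient": "e3", "problem_list": []},
    {"patient": "e4", "problem_list": ["hypertension"]},
]

sources = {"claims": from_claims(claims_rows), "ehr": from_ehr(ehr_rows)}
for name, people in sources.items():
    print(name, prevalence(people, "hypertension"))
```

Only the two mapping functions know about the source-specific layouts; `prevalence` is written once against the common format, which is what makes side-by-side comparison of sources straightforward in this sketch.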

Conference/Value in Health Info

2014-05, ISPOR 2014, Palais des Congres de Montreal

Value in Health, Vol. 17, No. 3 (May 2014)

Code

PRM39

Topic

Real World Data & Information Systems

Topic Subcategory

Reproducibility & Replicability

Disease

Cardiovascular Disorders
