USING PROPENSITY MATCHING AND IMPUTATION METHODS TO INTEGRATE PATIENT-REPORTED SURVEY DATA WITH ELECTRONIC HEALTH RECORDS IN TYPE 2 DIABETES

Author(s)

Lee LK1, Liebert R2, Gupta S2, Flores NM1, Haskell T3
1Kantar Health, San Mateo, CA, USA, 2Kantar Health, New York, NY, USA, 3Kantar Health, Horsham, PA, USA

OBJECTIVES: Electronic health record (EHR) data and patient-reported survey data each have their own strengths and limitations. This study aimed to develop a method for integrating these disparate datasets (patient-reported survey data with EHR data) to provide a more complete view of disease characteristics and health outcomes among patients with type 2 diabetes (T2D). METHODS: Two data sources were used: 1) data (2012 to current) from a large, nationally representative US ambulatory EHR database and 2) data from the 2016 US National Health and Wellness Survey (NHWS), a nationally representative, self-administered, internet-based survey of adults (≥18 years). NHWS respondents were classified as T2D patients if they self-reported a physician diagnosis of T2D. Adult T2D patients were identified in the EHR by diagnosis codes (ICD-9, ICD-10, or SNOMED), by text strings indicating T2D in the diagnosis field, or by two or more prescriptions for oral antidiabetic medications or GLP-1 injections. Variables common to both data sources included demographics (e.g., age, gender, ethnicity) and comorbidities (e.g., Charlson Comorbidity Index). Patients from NHWS were matched to patients in the EHR using an algorithm based on the propensity to be in the NHWS dataset, with the common variables as predictors. Within the matched dataset, missing values of variables of interest (e.g., HbA1c, health-related quality of life) were imputed. RESULTS: A total of 3,347,750 patients with T2D were identified in the EHR and 4,113 in NHWS. Mean age was 64 years in the EHR sample and 58 years in the NHWS sample. The final matched sample included 12,399 patients with T2D (1:2 match). CONCLUSIONS: Using propensity matching and imputation methods, disparate datasets could be combined to provide a more informative dataset of patient and disease characteristics.
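
As an illustration only, the matching and imputation steps described in METHODS might be sketched as follows. The abstract does not specify the propensity model, the matching algorithm, or the imputation method, so everything here is an assumption: a logistic model fitted by gradient descent, greedy 1:2 nearest-neighbour matching without replacement, and a simple donor-based fill of missing values. The data are synthetic, and all names (`fit_logistic`, `match_1_to_k`, `hrqol`, and so on) are hypothetical.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def propensity(w, x):
    """Estimated probability of being in the survey, given features x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = propensity(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def match_1_to_k(survey_ps, ehr_ps, k=2):
    """Greedy nearest-neighbour matching on the score, without replacement."""
    available = set(range(len(ehr_ps)))
    matches = {}
    for i, ps in enumerate(survey_ps):
        nearest = sorted(available, key=lambda j: (abs(ehr_ps[j] - ps), j))[:k]
        matches[i] = nearest
        available -= set(nearest)
    return matches

def features(age, cci):
    # Assumed common variables: age and Charlson Comorbidity Index, rescaled.
    return [(age - 60.0) / 10.0, cci / 2.0]

# Synthetic stand-ins for the two sources (not real patient data).
rng = random.Random(0)
survey = [{"age": rng.gauss(58, 10), "cci": rng.randint(0, 5),
           "hrqol": rng.gauss(0.75, 0.1), "hba1c": None} for _ in range(20)]
ehr = [{"age": rng.gauss(64, 10), "cci": rng.randint(0, 5),
        "hba1c": rng.gauss(7.5, 1.0), "hrqol": None} for _ in range(200)]

# Propensity to be in the survey, predicted from the common variables.
X = [features(p["age"], p["cci"]) for p in survey + ehr]
y = [1] * len(survey) + [0] * len(ehr)
w = fit_logistic(X, y)
survey_ps = [propensity(w, x) for x in X[:len(survey)]]
ehr_ps = [propensity(w, x) for x in X[len(survey):]]

matches = match_1_to_k(survey_ps, ehr_ps, k=2)

# Assumed imputation rule: fill each missing value from the matched records.
for i, js in matches.items():
    survey[i]["hba1c"] = sum(ehr[j]["hba1c"] for j in js) / len(js)
    for j in js:
        ehr[j]["hrqol"] = survey[i]["hrqol"]

matched = [survey[i] for i in matches] + [ehr[j] for js in matches.values() for j in js]
print(len(matched))  # 20 survey patients plus 40 matched EHR patients
```

Greedy matching without replacement is only one possible design; a caliper on the score difference or optimal matching could give different matched sets, and a model-based or multiple imputation would normally be preferred over the simple fill shown here.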

Conference/Value in Health Info

2017-05, ISPOR 2017, Boston, MA, USA

Value in Health, Vol. 20, No. 5 (May 2017)

Code

PRM63

Topic

Methodological & Statistical Research, Real World Data & Information Systems

Topic Subcategory

Confounding, Selection Bias Correction, Causal Inference, Reproducibility & Replicability

Disease

Diabetes/Endocrine/Metabolic Disorders
