HARNESSING BIG DATA: A METHODOLOGICAL APPROACH TO LINKING ELECTRONIC HEALTH RECORDS WITH PATIENT-REPORTED SURVEY DATA

Author(s)

Liebert R1, Lee LK2, Jaffe DH3, Doane MJ4, Haskell T4
1Kantar Health, New York, NY, USA, 2Kantar Health, San Mateo, CA, USA, 3Kantar Health, Tel Aviv, Israel, 4Kantar Health, Horsham, PA, USA

OBJECTIVES : To assess the feasibility of linking a large, nationally representative patient-reported database with an electronic health records (EHR) database to enhance patient data.

METHODS : Patient-Centered-Research (PaCeR) datasets comprising 3 years (2015-2017; total N=270,207) of patient-reported data were included in a HIPAA-compliant linking methodology involving more than 50 million patients from an EHR database. Linking was performed by comparing Protected Health Information from the EHR database with Personally Identifiable Information from PaCeR. Data used in the linking included first and last name, address, zip code, gender, date of birth, email address, and phone number. Once the data were linked, the prevalence of diagnosed type 2 diabetes (T2D), rheumatoid arthritis (RA), psoriasis, inflammatory bowel disease (IBD), depression, and migraine was examined for linked, non-linked, and all PaCeR respondents.
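The abstract does not specify the matching algorithm, so the following is only a minimal sketch of how identifier-based linking of this kind might look. It assumes exact (deterministic) matching on a salted hash of four normalized fields; the field names, the salt, the choice of fields, and the functions `normalize`, `link_key`, and `link_records` are all hypothetical and are not taken from the study.

```python
import hashlib
import re


def normalize(value: str) -> str:
    """Lowercase the value and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", value.lower())


def link_key(record: dict, salt: str = "example-salt") -> str:
    """Build a salted SHA-256 hash over a few normalized identifiers.

    Assumption: the fields and salt are illustrative, not the study's.
    """
    fields = ("first_name", "last_name", "date_of_birth", "zip_code")
    joined = "|".join(normalize(record[f]) for f in fields)
    return hashlib.sha256((salt + joined).encode("utf-8")).hexdigest()


def link_records(survey: list[dict], ehr: list[dict]) -> list[tuple[str, str]]:
    """Return (survey_id, ehr_id) pairs whose link keys match exactly."""
    ehr_index = {link_key(r): r["id"] for r in ehr}
    pairs = []
    for s in survey:
        key = link_key(s)
        if key in ehr_index:
            pairs.append((s["id"], ehr_index[key]))
    return pairs


# Made-up records: formatting differs between sources, identity does not.
survey = [
    {"id": "S1", "first_name": "Ana ", "last_name": "O'Neil",
     "date_of_birth": "1980-02-03", "zip_code": "10001"},
    {"id": "S2", "first_name": "Ben", "last_name": "Katz",
     "date_of_birth": "1975-11-30", "zip_code": "94401"},
]
ehr = [
    {"id": "E9", "first_name": "ana", "last_name": "ONEIL",
     "date_of_birth": "1980/02/03", "zip_code": "10001"},
]

pairs = link_records(survey, ehr)
print(pairs)
```

Normalizing before hashing lets records that differ only in case, spacing, or punctuation still match, while comparing hashes rather than raw values keeps identifiers out of the comparison step. A production linkage would likely also need fuzzy or probabilistic matching to handle misspellings and changed addresses, which this sketch does not attempt.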

RESULTS : After linking, 7,266 PaCeR respondents were identified as having linked records in the EHR database. Of these, 941 self-reported a physician's diagnosis of T2D, 308 of RA, 271 of psoriasis, 149 of IBD, 1,902 of depression, and 1,028 of migraine. Prevalence estimates were highest for the linked respondent subsample, followed by the full PaCeR sample, and lowest for the non-linked subsample. This relationship (linked vs. full sample vs. non-linked) held for the prevalence of T2D (13.98% vs. 8.91% vs. 8.75%), RA (4.47% vs. 2.92% vs. 2.87%), psoriasis (3.68% vs. 2.72% vs. 2.69%), IBD (1.94% vs. 1.24% vs. 1.22%), depression (25.69% vs. 19.63% vs. 19.44%), and migraine (13.34% vs. 9.81% vs. 9.70%).

CONCLUSIONS : Linking the PaCeR and EHR databases using HIPAA-compliant methods was successful, yielding a subsample of linked patients for whom both patient-reported and clinical data can be used to address research questions. Prevalence estimates for the linked, non-linked, and full PaCeR samples were as expected, with the highest prevalence among those seeking care (linked) and the lowest among those who may or may not be seeking care (non-linked).

Conference/Value in Health Info

2018-05, ISPOR 2018, Baltimore, MD, USA

Value in Health, Vol. 21, S1 (May 2018)

Code

PHP173

Topic

Health Service Delivery & Process of Care

Topic Subcategory

Health Care Research

Disease

Multiple Diseases
