Leveraging Data Aggregators to Annotate Deidentified Genomic Data Derived from Commercial Laboratory Specimens to Study COVID-19 Severity

Author(s)

Dandiker S (1), Latham A (1), Tanpaiboon P (2), Ratajski AM (2), Bare L (3), Chanock S (4), Fesko YA (2), Joseph V (1), Offit K (1)
(1) Memorial Sloan Kettering Cancer Center, New York, NY, USA; (2) Quest Diagnostics, Secaucus, NJ, USA; (3) Quest Diagnostics, Moraga, CA, USA; (4) National Cancer Institute, Bethesda, MD, USA

BACKGROUND: Understanding the factors influencing COVID-19 severity is needed to improve patient outcomes.

OBJECTIVES: To identify genetic factors associated with severe COVID-19 infections, we used de-identified data sources from a commercial diagnostic laboratory merged with limited clinical annotation from a data aggregator.

METHODS: The study used limited clinical data from patients who tested positive for SARS-CoV-2 by nucleic acid analysis within a year (median 1 month) of remnant whole-blood collection for clinical care. The commercial laboratory (Quest Diagnostics) coded the remnant specimens with a study ID (QD-pID) and removed HIPAA identifiers. A Limited Use Data set (ZIP code, state, gender, vital status, date of ascertainment, and SARS-CoV-2 results) and the de-identified samples were sent to an academic center (MSKCC), which further de-identified the specimens by replacing the QD-pID with a study participant ID (MSK-pID) and sent the samples to the National Institutes of Health for germline whole-genome SNP array genotyping. To determine disease severity, the research team used de-identified individual-level data provided by HealthVerity, a data aggregator, in the form of patient claims linked to the SARS-CoV-2 test accession ID and QD-pID. The study was reviewed by the Western Institutional Review Board and deemed exempt from the requirement for consent under 45 CFR § 46.104(d)(4).
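The second de-identification step (replacing QD-pID with MSK-pID) can be illustrated with a minimal sketch. The abstract does not say how study IDs were generated or stored, so everything here is an assumption for illustration: the `MSK-` prefix format, the random-token scheme, the field names `qd_pid` and `msk_pid`, and the helper names `recode_ids` and `strip_and_recode` are all hypothetical.

```python
import secrets


def recode_ids(qd_ids, prefix="MSK-"):
    """Build a crosswalk from incoming lab IDs to fresh random study IDs.

    Hypothetical scheme: the abstract does not describe how MSK-pIDs were made.
    """
    crosswalk = {}
    used = set()
    for qd_id in qd_ids:
        if qd_id in crosswalk:
            continue  # one study ID per incoming ID, even if it repeats
        new_id = prefix + secrets.token_hex(8)
        while new_id in used:  # regenerate on the (unlikely) collision
            new_id = prefix + secrets.token_hex(8)
        used.add(new_id)
        crosswalk[qd_id] = new_id
    return crosswalk


def strip_and_recode(records, crosswalk, id_field="qd_pid", new_field="msk_pid"):
    """Return copies of the records with the old ID removed and the new ID added."""
    recoded = []
    for rec in records:
        new_rec = {k: v for k, v in rec.items() if k != id_field}
        new_rec[new_field] = crosswalk[rec[id_field]]
        recoded.append(new_rec)
    return recoded


# Example with made-up IDs; only the recoded records would travel downstream.
crosswalk = recode_ids(["QD-001", "QD-002"])
outgoing = strip_and_recode([{"qd_pid": "QD-001", "state": "NY"}], crosswalk)
```

In a design like this, the crosswalk would stay with the party performing the recoding, so the genotyping laboratory sees only the new IDs.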

RESULTS: Quest Diagnostics provided identifiers for N=9,241 SARS-CoV-2-positive samples, of which 4,644 samples from COVID-19 patients were sent for genotyping. Of the 4,644 genotyped samples, 1,118 (24%) had ICD codes in HealthVerity’s data marketplace that were comprehensive enough for severity stratification. These were classified as 914 mild, 10 moderate, and 194 severe COVID-19-positive cases. Correlation with genotype is ongoing.
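The counts reported in the results can be checked with a few lines of arithmetic. This is only a sketch that restates the numbers from the abstract; the variable names and the one-decimal rounding are choices made here, not part of the study.

```python
# Counts taken directly from the RESULTS section.
identifiers_provided = 9241
genotyped = 4644
severity_counts = {"mild": 914, "moderate": 10, "severe": 194}


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Samples with ICD codes sufficient for severity stratification.
annotated = sum(severity_counts.values())

# Share of genotyped samples that could be stratified (reported as 24%).
annotated_pct = pct(annotated, genotyped)

# Share of each severity class among the stratified samples.
severity_pct = {label: pct(n, annotated) for label, n in severity_counts.items()}

print(annotated, annotated_pct, severity_pct)
```

The three severity counts sum to the 1,118 annotated samples, and most of the stratified cases fall in the mild class.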

CONCLUSIONS: This design describes a model, compliant with human subjects research requirements, for public health studies of the association of inherited genetic variation with COVID-19 severity. It links remnant biospecimens from a commercial laboratory with clinical annotation provided by a data aggregator and de-identified genotyping by a third-party laboratory.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

EPH95

Topic

Epidemiology & Public Health, Real World Data & Information Systems, Study Approaches

Topic Subcategory

Disease Classification & Coding, Health & Insurance Records Systems, Public Health

Disease

Infectious Disease (non-vaccine), No Additional Disease & Conditions/Specialized Treatment Areas
