Data Linkage in Practice: A Living Systematic Review of Clinical Trials in the United States (US) Utilizing Linkage to Real-World Data
Author(s)
Evelyn J. Rizzo, MSc1, Kevin Kallmes, BS, MA, JD2, Thomas Dougherty3.
1Mobility HEOR, AKRON, OH, USA, 2Nested Knowledge, St. Paul, MN, USA, 3RWE Strategy Lead, Novo Nordisk, Dallas, PA, USA.
OBJECTIVES: Data linkage and tokenization are increasingly being adopted to address current limitations in clinical trials; however, research on topics and implementation of linkage/tokenization in practice is limited. The objectives of this systematic review were to describe and quantify examples of published clinical trials in the US that used data linkage and evaluate the analytical goals and uses of linked data.
METHODS: Relevant articles were identified through PubMed and ClinicalTrials.gov searches implemented on an artificial intelligence-assisted systematic literature review platform (AutoLit, Nested Knowledge), covering publications from 2014 to 2025. Articles were included if they reported a pharmacological intervention and a US-based study population. Study background, objective, patient disease state, type of linked data, linked data elements, and linkage methods were extracted from each study.
RESULTS: Of 902 abstracts screened, 31 publications reporting trials with linkage were included in this review. The studies were sponsored by industry (8), academic (6), and government (17) institutions. There were 11 interventional trials, 1 phase II, 14 phase III, and 5 phase IV trials. Trial data were linked with real-world datasets, including claims data (74.2%), registries (16%), and electronic health records (10%). The disease states were: Cardiovascular Risk (10), Cancer/Tumors (5), Aortic Stenosis (2), Kidney Disease (3), Women’s Health (3), and Other (7). Most studies used deterministic linkage (61.3%), followed by hybrid or unclear methods (25.8%) and probabilistic linkage (12.9%). Among the 28 studies that reported the percentage of the population successfully linked, linkage rates ranged from 11.6% to 100%, with an average of 64.7%. The key objectives for using linkage were efficacy (9), cost (5), methodology/validation (7), safety/adverse events (3), feasibility (3), survival (3), and medical history (1).
CONCLUSIONS: This review demonstrates the increased use of data linkage by US-based government, industry, and academic centers in clinical drug trials across a broad range of therapeutic areas and objectives. These findings show a burgeoning role for linkage in expanding outcome collection and analysis across diverse disease areas.
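ILLUSTRATION: For readers unfamiliar with the terms used above, the following is a minimal sketch of deterministic linkage (exact matching on a shared identifier) and of how a linkage rate could be computed. It is not drawn from any of the reviewed studies; the token values, function names, and data are hypothetical and serve only as an example.

```python
def deterministic_link(trial_tokens, rwd_tokens):
    """Return the trial tokens that have an exact match in the real-world dataset.

    Deterministic linkage here means exact equality on one identifier
    (a hypothetical de-identified patient token); no fuzzy matching is done.
    """
    rwd = set(rwd_tokens)
    return [token for token in trial_tokens if token in rwd]


def linkage_rate(trial_tokens, linked_tokens):
    """Percentage of trial participants who were successfully linked."""
    if not trial_tokens:
        raise ValueError("trial_tokens must not be empty")
    return 100.0 * len(linked_tokens) / len(trial_tokens)


# Hypothetical example data: four trial participants, three claims records.
trial = ["tok_a", "tok_b", "tok_c", "tok_d"]
claims = ["tok_b", "tok_d", "tok_x"]

linked = deterministic_link(trial, claims)
rate = linkage_rate(trial, linked)
print(f"Linked {len(linked)} of {len(trial)} participants ({rate:.1f}%)")
```

Probabilistic linkage, by contrast, would score candidate pairs on partial agreement across several fields and accept matches above a chosen threshold; that approach is not shown here.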
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
RWD49
Topic
Real World Data & Information Systems
Topic Subcategory
Health & Insurance Records Systems
Disease
No Additional Disease & Conditions/Specialized Treatment Areas