GUIDING RECORD LINKAGE (RL) METHOD SELECTION IN HEOR: A TARGETED REVIEW AND STRUCTURED DECISION FRAMEWORK
Author(s)
Bruce Morrison, M.S.Ed.
BeiGene / BeOne Medicines Ltd, HEOR & RWE, Cambridge, MA, USA
OBJECTIVES: Record linkage (RL) remains uncommon in health economics and outcomes research (HEOR) and patient-reported outcome (PRO) studies. This is unfortunate because multi-source real-world data (RWD) offer benefits that are otherwise difficult to obtain, especially in oncology, where clinical, treatment, laboratory, and symptom information is often siloed across different systems. Linking cancer registry and electronic health record datasets can improve the completeness of tumor staging, treatment, exposure, and other variables relevant to PRO research. FDA guidance emphasizes the need to document linkage quality and provenance as linkage use grows. Opportunities for RL in HEOR and PRO contexts can be limited by sparse and inconsistent identifiers and by the risk of bias from linkage error. Despite the wealth of RL techniques available, such as Fellegi-Sunter models, Bayesian linkage, (semi-)supervised machine learning (ML), and privacy-preserving record linkage (PPRL), there is limited integrated guidance for selecting methods and implementation tools under real-world HEOR/PRO constraints. We summarized the methodological and applied probabilistic record linkage literature to 1) compare linkage methods, and 2) present a decision tree that supports decisions about linkage appropriateness and method selection for a given HEOR/PRO use case.
METHODS: A targeted literature review of foundational, contemporary, and regulatory sources (n=20) was conducted. The review covered data quality challenges, comparator functions, uncertainty propagation, linkage error bias, and privacy preservation. Findings from the compiled sources were integrated into a structured decision framework that guides users through constraint and requirement checks to determine whether linkage is feasible and which method best suits each use case.
RESULTS: The synthesis identifies scenarios in which linkage is impossible or inadvisable and presents trade-offs among interpretable probabilistic record linkage (PRL), Bayesian RL, (semi-)supervised ML, and PPRL techniques, while incorporating HEOR/PRO constraints such as regulatory auditability and the sensitivity of outcomes to linkage misclassification.
CONCLUSIONS: This review provides literature-backed guidance for selecting linkage methods and tools in multi-source RWD environments, with the aim of supporting ethical and analytical integrity.
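ILLUSTRATION: The abstract names Fellegi-Sunter models as one of the interpretable probabilistic approaches under comparison. The sketch below is not the authors' framework or tooling; it is a minimal example of how a Fellegi-Sunter style match weight could be computed and thresholded. The field names, the m- and u-probabilities, and the two thresholds are all hypothetical values chosen for illustration, and fields are assumed to be conditionally independent.

```python
import math

# Hypothetical parameters, for illustration only (not taken from the review).
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "zip_code": (0.90, 0.05),
}


def match_weight(agreement):
    """Sum the log2 likelihood ratios over fields for one record pair.

    `agreement` maps each field name to True (agrees) or False (disagrees).
    Assumes conditional independence between fields.
    """
    total = 0.0
    for field, agrees in agreement.items():
        m, u = FIELD_PARAMS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Three-way decision using two hypothetical thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"


all_agree = {"last_name": True, "birth_date": True, "zip_code": True}
mixed = {"last_name": True, "birth_date": False, "zip_code": True}
none_agree = {"last_name": False, "birth_date": False, "zip_code": False}

for label, pair in [("all agree", all_agree), ("mixed", mixed), ("none agree", none_agree)]:
    w = match_weight(pair)
    print(f"{label}: weight={w:.2f} -> {classify(w)}")
```

Because each field contributes a separate, inspectable term to the total weight, this style of scoring is easier to audit than a black-box classifier, which is one reason interpretability appears among the trade-offs the review considers. In practice the m- and u-probabilities would be estimated from data (for example with an EM procedure) rather than fixed by hand.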
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD68
Topic
Real World Data & Information Systems
Topic Subcategory
Data Protection, Integrity, & Quality Assurance, Distributed Data & Research Networks
Disease
No Additional Disease & Conditions/Specialized Treatment Areas