ADVANCING COMPUTABLE OPERATIONAL DEFINITIONS WITH ONTOLOGY MAPPING AND SYNTHETIC DATA SIMULATION FOR MORE CONSISTENT REAL-WORLD EVIDENCE
Author(s)
Sadie Nordstrand, B.S.¹; Aaron Kamauu, MPH, MS, MD²; Joseph Flinders, B.S.¹; Adam Hansen, PhD¹
¹Geneial, Missouri City, TX, USA; ²Navidence, Inc., Bountiful, UT, USA
OBJECTIVES: Real-world evidence (RWE) depends on precise, reproducible study definitions to reduce variability and misclassification across real-world data (RWD) sources, such as registries, electronic health records, and claims databases. Computable Operational Definitions (CODEFs) provide standards-based, machine-readable representations of study elements, and a way to harmonize and share study logic. Building on efforts to develop CODEF libraries and promote consistency, there is an opportunity to expand their utility through ontology mapping and synthetic data simulation, creating a stronger foundation for reliable RWE in both rare and common diseases.
METHODS: The framework integrates three complementary approaches. Ontology mapping aligns RWD from registries, EHRs, claims, laboratory results, genomics, imaging, and patient-reported outcomes to standardized vocabularies such as SNOMED, HPO, RxNorm, and LOINC, establishing a shared semantic base for CODEFs. CODEF development translates standardized concepts into computable definitions of exposures, outcomes, and covariates that are transparent and adaptable across environments. Synthetic data simulation generates cohorts modeled on real-world distributions to validate CODEFs, quantify the impact of definitional choices on cohort size and outcomes, and test portability across sources while preserving privacy.
RESULTS: The framework enables interoperability by harmonizing diverse data elements to recognized standards, and it allows study logic to be validated on synthetic cohorts before it is applied to sensitive datasets. Together, these capabilities support both rare and common disease research by generating feasibility insights on endpoint selection, sample size, and trial readiness. The novelty of the approach lies in the tight coupling of ontology mapping, CODEF formalization, and privacy-preserving synthetic cohort evaluation within a single, reusable workflow.
CONCLUSIONS: By combining ontology mapping, CODEF development, and synthetic data validation, this framework offers a path toward more consistent, interoperable, and regulator-ready RWE. While particularly valuable in data-sparse rare diseases, the methods are broadly applicable across therapeutic areas and strengthen the reproducibility and decision-readiness of evidence for regulatory and clinical use.
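The three steps described under METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it (`LOCAL_TO_STANDARD`, `Codef`, `simulate_cohort`, `apply_codef`) is hypothetical, the source and concept codes are made-up placeholders rather than real SNOMED, HPO, RxNorm, or LOINC identifiers, and the synthetic cohort is drawn uniformly at random rather than modeled on real-world distributions as the abstract describes.

```python
import random
from dataclasses import dataclass

# Step 1 (ontology mapping): hypothetical map from (source, local code) to a
# shared concept. All identifiers are placeholders, not real vocabulary codes.
LOCAL_TO_STANDARD = {
    ("site_a", "DM2"): "STD:T2D",
    ("site_b", "DX_017"): "STD:T2D",
    ("site_a", "HBA1C"): "STD:HBA1C",
    ("site_b", "LAB_4548"): "STD:HBA1C",
}


def map_record(source, local_code):
    """Return the shared concept for a local code, or None if unmapped."""
    return LOCAL_TO_STANDARD.get((source, local_code))


# Step 2 (CODEF development): a definition expressed as data, so it can be
# shared and re-run unchanged in another environment.
@dataclass(frozen=True)
class Codef:
    name: str
    concept: str       # shared concept the definition looks for
    min_records: int   # records of that concept required to qualify


# Step 3 (synthetic data simulation): uniform random records, a deliberate
# simplification of a generator fitted to real-world distributions.
def simulate_cohort(n_patients, seed=0):
    rng = random.Random(seed)
    local_codes = list(LOCAL_TO_STANDARD)
    return {
        pid: [rng.choice(local_codes) for _ in range(rng.randint(0, 4))]
        for pid in range(n_patients)
    }


def apply_codef(codef, cohort):
    """Return the set of patient ids that satisfy the definition."""
    selected = set()
    for pid, records in cohort.items():
        hits = sum(
            1 for source, code in records
            if map_record(source, code) == codef.concept
        )
        if hits >= codef.min_records:
            selected.add(pid)
    return selected


cohort = simulate_cohort(1000, seed=42)
lenient = Codef("t2d_any_record", "STD:T2D", min_records=1)
strict = Codef("t2d_two_records", "STD:T2D", min_records=2)
lenient_ids = apply_codef(lenient, cohort)
strict_ids = apply_codef(strict, cohort)

# Compare cohort sizes under the two definitional choices.
print(len(lenient_ids), len(strict_ids))
```

Running both definitions against the same synthetic cohort shows how a single definitional choice (one qualifying record versus two) changes cohort size, without touching any patient-level real data.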
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD139
Topic
Real World Data & Information Systems
Topic Subcategory
Data Protection, Integrity, & Quality Assurance, Distributed Data & Research Networks, Reproducibility & Replicability
Disease
No Additional Disease & Conditions/Specialized Treatment Areas