REAL-WORLD DATABASES IN CHINA: REGIONAL LANDSCAPE, INTEGRATION CHALLENGES, AND METHODOLOGICAL PRACTICE
Author(s)
Adele Li, MBA
Vinzent Strategies Economic Management Consulting Service Co., Ltd., Shanghai, China
OBJECTIVES: China’s real‑world data (RWD) ecosystem is regionalized and heterogeneous, spanning city/region‑level electronic health records (EHRs), insurance/claims‑like databases, and disease/quality registries (e.g., the national cancer registry). Understanding source characteristics and integration challenges is essential for generating fit‑for‑purpose real‑world evidence (RWE).
METHODS: We reviewed representative regional EHRs, insurance databases, and registries, comparing data architecture, geographic representativeness, refresh cycles, hospital‑tier coverage, longitudinal continuity, and coding standards (diagnoses, procedures, medications, labs). Drawing on multi‑project practice, we synthesized governance and bias‑mitigation pathways: code mapping and master data dictionaries; visit/episode reconstruction; variable harmonization across sources; handling missingness and unequal follow‑up (multiple imputation, weighting); privacy‑compliant linkage strategies (where permissible); and external benchmarking with prespecified sensitivity analyses.
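As an illustration of the visit/episode reconstruction step, the following is a minimal sketch and not the authors' implementation. It assumes each visit is a hypothetical (patient_id, start_date, end_date) tuple and that consecutive visits separated by no more than max_gap_days belong to the same episode; the function name and the one-day default are assumptions chosen for illustration.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter


def build_episodes(visits, max_gap_days=1):
    """Merge one patient's visits into episodes of care.

    visits: iterable of (patient_id, start_date, end_date) tuples.
    max_gap_days: hypothetical threshold; a visit starting within this
    many days of the current episode's end is merged into that episode.
    """
    episodes = []
    # Sort by patient, then by visit start, so groupby sees each patient once.
    ordered = sorted(visits, key=itemgetter(0, 1))
    for patient_id, rows in groupby(ordered, key=itemgetter(0)):
        current_start = current_end = None
        for _, start, end in rows:
            if current_start is None:
                current_start, current_end = start, end
            elif (start - current_end).days <= max_gap_days:
                # Overlapping or adjacent visit: extend the open episode.
                current_end = max(current_end, end)
            else:
                # Gap too large: close the episode and open a new one.
                episodes.append((patient_id, current_start, current_end))
                current_start, current_end = start, end
        episodes.append((patient_id, current_start, current_end))
    return episodes


example_visits = [
    ("P1", date(2024, 1, 1), date(2024, 1, 3)),
    ("P1", date(2024, 1, 4), date(2024, 1, 6)),
    ("P1", date(2024, 2, 1), date(2024, 2, 2)),
    ("P2", date(2024, 3, 5), date(2024, 3, 5)),
]
episodes = build_episodes(example_visits)
```

In practice the gap threshold would be prespecified per care setting (for example, a different value for inpatient transfers than for outpatient follow-up) and varied in sensitivity analyses.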
RESULTS: Regional EHRs capture local clinical practice with provider‑dependent timeliness (bi‑weekly/monthly updates); insurance databases offer systematic medication and cost perspectives; registries provide focused clinical granularity and quality metrics but may be selective and center‑biased. Common limitations include limited national representativeness, uneven hospital coverage, heterogeneous definitions/units, discontinuous follow‑up, data sparsity, and constrained cross‑institution linkage. Effective use requires fit‑for‑purpose study design, prespecified statistical analysis plans, transparent provenance and quality controls (completeness, plausibility, outlier handling), and layered reporting of data quality. Methodologically, standardized coding (ICD/ATC/local), cohort construction and episode linking, strategies for interrupted care, and careful causal inference with robustness checks are critical to mitigate bias and enhance credibility.
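The completeness and plausibility controls mentioned above can be sketched as simple per-variable summaries. This is an illustrative example under stated assumptions, not a description of the authors' quality framework: records are hypothetical dictionaries, the field name sbp is invented for the example, and the plausible range is a placeholder that a study team would prespecify.

```python
def quality_report(records, field, low, high):
    """Summarize completeness and plausibility for one numeric field.

    records: list of dicts, one per observation (hypothetical layout).
    low, high: prespecified plausible range (placeholder values here).
    """
    values = [r.get(field) for r in records]
    present = [v for v in values if v is not None]
    in_range = [v for v in present if low <= v <= high]
    return {
        # Share of records with a non-missing value.
        "completeness": len(present) / len(values) if values else 0.0,
        # Share of non-missing values that fall inside the plausible range.
        "plausibility": len(in_range) / len(present) if present else 0.0,
    }


example_records = [{"sbp": 120}, {"sbp": None}, {"sbp": 400}, {}]
report = quality_report(example_records, "sbp", low=50, high=300)
```

Reporting such summaries by source and by hospital tier is one way to realize the layered reporting of data quality described in the results.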
CONCLUSIONS: Regional EHRs, insurance databases, and registries collectively enable impactful RWE in China but demand rigorous integration and explicit bias mitigation given fragmentation and heterogeneity. Deep knowledge of source‑specific features, limitations, and complementarities—combined with localized analytical expertise and strong data governance—can enhance scientific validity and decision relevance as infrastructure and interoperability evolve.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD67
Topic
Real World Data & Information Systems
Topic Subcategory
Health & Insurance Records Systems
Disease
No Additional Disease & Conditions/Specialized Treatment Areas