Evaluating the Representativeness of a Real-World Oncology Database Developed by the J-CONNECT Consortium: Comparison With Japan's National Cancer Registry
Author(s)
Masafumi Okada, MD, PhD1, Shigemi Matsumoto, MD, PhD2.
1Prime Research Institute for Medical RWD, Inc., Kyoto, Japan, 2Department of Real World Data R&D, Kyoto University, Kyoto, Japan.
OBJECTIVES: The J-CONNECT Consortium is a collaborative academic network in Japan that is developing a real-world database of patients with solid cancers treated with chemotherapy, built from electronic medical records (EMRs). Data from 10 hospitals are currently available for analysis, and more institutions have already joined the consortium. The database includes core cancer registration data, as well as comprehensive prescription, injection, and laboratory test records extracted from EMRs. It is constructed under an academic framework with an opt-out informed consent process. For more detailed analyses, researchers can access additional data through a federated model or by collaborating directly with individual hospitals.
Compared to the national cancer registry, this database provides richer clinical detail and greater accessibility for industrial research. However, its representativeness is constrained by the limited number of participating institutions. We evaluated the representativeness of the database by comparing cancer type and patient demographic distributions with those from the national cancer registry.
METHODS: Coverage of database cases was calculated by cancer site. Stratified coverage by age and gender was evaluated for stomach, esophagus, colon, lung, prostate, and breast cancers using publicly available national cancer registry data. Coefficient of variation (CV) was used to assess variability in coverage across strata.
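The coverage and CV calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes coverage is the number of database cases divided by the number of national registry cases in the same stratum, and it assumes a population standard deviation for the CV (the abstract does not state which form was used). The function names and the counts are hypothetical.

```python
from statistics import mean, pstdev


def coverage(db_cases, registry_cases):
    """Coverage per stratum: database cases / registry cases (assumed definition)."""
    return {k: db_cases[k] / registry_cases[k] for k in db_cases}


def coefficient_of_variation(values):
    """CV = standard deviation / mean; population SD is an assumption here."""
    values = list(values)
    return pstdev(values) / mean(values)


# Hypothetical counts for illustration only; these are not study data.
db = {"male": 120, "female": 80}
registry = {"male": 10000, "female": 8000}

cov = coverage(db, registry)                 # coverage by gender stratum
cv = coefficient_of_variation(cov.values())  # variability of coverage across strata
print(cov, round(cv, 3))
```

A CV near zero means the database captures a similar share of registry cases in every stratum, which is the sense in which low CV is read as good representativeness.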
RESULTS: Site-specific coverage ranged from 0.35% to 1.60% (CV = 0.575). Age-stratified CV ranged from 0.220 (breast) to 0.511 (stomach). Gender-stratified CV (excluding breast and prostate cancers) ranged from 0.002 (stomach) to 0.073 (colon).
CONCLUSIONS: Although the database includes only chemotherapy-treated patients, gender distributions were largely consistent with national data. Coverage varied by age and cancer site, but for cancer types with less age-related variation, this real-world oncology database may serve as a representative resource for Japanese patients with solid cancers.
Conference/Value in Health Info
2025-09, ISPOR Real-World Evidence Summit 2025, Tokyo, Japan
Value in Health Regional, Volume 49S (September 2025)
Code
RWD40
Topic Subcategory
Distributed Data & Research Networks
Disease
SDC: Oncology