Benchmarking Clinical Trial Diversity: Methods to Characterize Eligibility-Based Representativeness
Speaker(s)
Benedum C¹, Griffith SD¹, Bozkurt S², Sarkar S¹
¹Flatiron Health, New York, NY, USA; ²Emory University, Atlanta, GA, USA
OBJECTIVES: Regulators and payers have highlighted the need for inclusive and diverse clinical trials to ensure that drugs provide clinical benefit and value to the full approved indication population. It is therefore important to benchmark the inclusiveness of trials by measuring the representativeness of the study population. Although multiple methods for assessing representativeness exist across scientific disciplines, they have not been compared directly, and no recommendations exist on which methods to apply in practice. We aim to describe methods for assessing the representativeness of clinical trial eligibility criteria and to evaluate their utility when applied to real-world examples.
METHODS: We compared three representativeness scoring methods from three fields: log disparity (LD; machine learning fairness), GIST (bioinformatics), and a propensity score (PS)-based metric (statistics). LD summarizes representativeness for a specific characteristic, while GIST and PS summarize representativeness across many characteristics. Using real-world data from the nationwide Flatiron Health electronic health record-derived de-identified database as a benchmark (N=50,263), we selected patients who met trial eligibility and therapy indication criteria for nine cross-sponsor phase III advanced non-small cell lung cancer trials.
RESULTS: GIST scores ranged from 0.17 to 0.67 and PS-based scores from 1.07 to 2.74, suggesting low to moderate representativeness according to field-specific benchmarks. The GIST and PS metrics were strongly inversely correlated (ρ = −0.90); however, the PS metric incorporated additional clinical and demographic characteristics, whereas GIST is based on eligibility criteria alone. LD scores ranged from 37% to 201% and offered a complementary, more granular view: they estimate the magnitude of underrepresentation (below 95%) or overrepresentation (above 105%) for specific protected characteristics such as age, race/ethnicity, and gender in the trial-eligible population.
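ILLUSTRATION: The under/overrepresentation thresholds reported above (below 95% and above 105%) can be sketched in Python. This is a minimal sketch, not the authors' implementation: the abstract does not give the LD formula, so the score is assumed here to be a simple ratio of a subgroup's share among trial-eligible patients to its share in the benchmark population, expressed as a percentage. The function names `representation_ratio` and `classify_representation` are hypothetical, and the shares in the usage example are made up for illustration.

```python
def representation_ratio(eligible_share: float, benchmark_share: float) -> float:
    """Subgroup share among trial-eligible patients relative to the benchmark, in percent.

    ASSUMPTION: a plain ratio of proportions stands in for the LD score;
    the abstract does not state the exact definition the authors used.
    """
    if not 0.0 < benchmark_share <= 1.0:
        raise ValueError("benchmark_share must be in (0, 1]")
    if not 0.0 <= eligible_share <= 1.0:
        raise ValueError("eligible_share must be in [0, 1]")
    return 100.0 * eligible_share / benchmark_share


def classify_representation(score_pct: float,
                            lower: float = 95.0,
                            upper: float = 105.0) -> str:
    """Apply the thresholds quoted in the abstract (below 95%, above 105%)."""
    if score_pct < lower:
        return "underrepresented"
    if score_pct > upper:
        return "overrepresented"
    return "representative"


# Hypothetical subgroup: 25% of the trial-eligible population, 50% of the benchmark.
score = representation_ratio(eligible_share=0.25, benchmark_share=0.50)
label = classify_representation(score)
print(f"{score:.0f}% -> {label}")
```

Under this assumed definition, a score of 100% means the subgroup appears among trial-eligible patients in the same proportion as in the benchmark population; the endpoints of the reported range (37% and 201%) would fall in the underrepresented and overrepresented bands, respectively.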
CONCLUSIONS: The different scores provide complementary yet distinct insights and can be used together for a comprehensive assessment of clinical trial representativeness, which is important for projecting clinical benefit in diverse populations.
Code
HPR128
Topic
Health Policy & Regulatory
Topic Subcategory
Health Disparities & Equity
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, Oncology