Benchmarking Clinical Trial Diversity: Methods to Characterize Eligibility-Based Representativeness

Speaker(s)

Benedum C¹, Griffith SD¹, Bozkurt S², Sarkar S¹
¹Flatiron Health, New York, NY, USA, ²Emory University, Atlanta, GA, USA

OBJECTIVES: Regulators and payers have highlighted the need for inclusive and diverse clinical trials to ensure that drugs provide clinical benefit and value to the full approved indication population. It is therefore important to benchmark the inclusiveness of trials by measuring the representativeness of the study population. While multiple methods from various scientific disciplines exist for assessing representativeness, they have not been compared directly, and no recommendations exist on which methods to apply in practice. We aim to describe methods for assessing the representativeness of clinical trial eligibility criteria and to evaluate their utility when applied to real-world examples.

METHODS: We compared three representativeness scoring methods from three fields: log disparity (LD; machine learning fairness), GIST (bioinformatics), and a propensity score (PS)-based metric (statistics). LD summarizes representativeness for a specific characteristic, while GIST and PS summarize representativeness across many characteristics. Using real-world data from the nationwide Flatiron Health electronic health record-derived de-identified database as a benchmark (N=50,263), we selected patients who met trial eligibility and therapy indication criteria for nine cross-sponsor phase III advanced non-small cell lung cancer trials.
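As a rough illustration of what an eligibility-only score measures, the sketch below computes the share of a benchmark cohort that satisfies every criterion of a trial. This is a simplified proportion, not the GIST formula or the scoring used in the study; the field names (`age`, `ecog`), the criteria, and the four-patient cohort are all hypothetical.

```python
from typing import Callable, Dict, List

Patient = Dict[str, float]
Criterion = Callable[[Patient], bool]


def eligible_fraction(patients: List[Patient], criteria: List[Criterion]) -> float:
    """Share of the benchmark cohort that meets every eligibility criterion.

    Assumption: all criteria are weighted equally and a patient must
    satisfy all of them. Published indices may weight traits differently.
    """
    if not patients:
        raise ValueError("benchmark cohort is empty")
    n_eligible = sum(1 for p in patients if all(c(p) for c in criteria))
    return n_eligible / len(patients)


# Hypothetical benchmark cohort and criteria, for illustration only.
cohort = [
    {"age": 62, "ecog": 0},
    {"age": 81, "ecog": 1},
    {"age": 70, "ecog": 2},
    {"age": 55, "ecog": 1},
]
criteria = [
    lambda p: p["age"] <= 75,  # hypothetical upper age limit
    lambda p: p["ecog"] <= 1,  # hypothetical performance-status limit
]

fraction = eligible_fraction(cohort, criteria)
```

A value near 1 would mean the criteria exclude few benchmark patients; a value near 0 would mean the trial-eligible group is a small slice of the real-world population.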

RESULTS: GIST ranged from 0.17 to 0.67 and the PS-based metric from 1.07 to 2.74, suggesting low to moderate representativeness according to field-specific benchmarks. The GIST and PS metrics were strongly correlated (ρ=-0.90); however, the PS metric incorporated clinical and demographic characteristics that GIST, which is based on eligibility criteria alone, does not. LD scores ranged from 37% to 201% and offered a complementary, more granular view, estimating the magnitude of under- or overrepresentation (below 95% and above 105%, respectively) for specific protected characteristics such as age, race/ethnicity, and gender in the trial-eligible population.
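The 95% and 105% cut-offs can be read as a band around parity. The sketch below applies them to a simple ratio of subgroup shares (trial-eligible share divided by benchmark share, as a percentage). That ratio is an assumption chosen for illustration and is not necessarily how the LD score in the study is computed; the function names are hypothetical.

```python
def representation_ratio(eligible_share: float, benchmark_share: float) -> float:
    """Subgroup share among trial-eligible patients relative to the
    benchmark population, as a percentage (100 means parity).

    Assumption: a plain ratio of proportions stands in for the score;
    the study's LD metric may be defined differently.
    """
    if not 0.0 < benchmark_share <= 1.0:
        raise ValueError("benchmark_share must be in (0, 1]")
    if not 0.0 <= eligible_share <= 1.0:
        raise ValueError("eligible_share must be in [0, 1]")
    return 100.0 * eligible_share / benchmark_share


def classify(score_pct: float, lower: float = 95.0, upper: float = 105.0) -> str:
    """Label a percentage score using the 95%/105% band from the abstract."""
    if score_pct < lower:
        return "underrepresented"
    if score_pct > upper:
        return "overrepresented"
    return "comparable"


# The extremes of the reported LD range fall on opposite sides of the band.
low_label = classify(37.0)
high_label = classify(201.0)
```

Under this reading, a subgroup that makes up 20% of the benchmark population but only 10% of trial-eligible patients would score 50% and be labeled underrepresented.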

CONCLUSIONS: The different scores provide complementary yet distinct insights and can be combined into a comprehensive assessment of clinical trial representativeness, which is important for projecting clinical benefit in diverse populations.

Code

HPR128

Topic

Health Policy & Regulatory

Topic Subcategory

Health Disparities & Equity

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Oncology

ISPOR 2024

May 5-8, 2024