Towards Trustworthy and Equitable Healthcare AI: Defining Fairness Domains and Metrics for Consensus

Author(s)

Hyun Jin Han, MBA, MPH, PhD1, Shinyoung Park, BSc2, Jo Soomi, BSc2, Hae Sun Suh, MA, MS, PhD1.
1College of Pharmacy, Kyung Hee University, Seoul, Republic of Korea; 2Department of Regulatory Science, Kyung Hee University, Seoul, Republic of Korea.
OBJECTIVES: This study synthesizes the literature on fairness in healthcare AI through an umbrella review and examines perspectives on trustworthy AI from the Korean Ministry of Food and Drug Safety (MFDS), the U.S. Food and Drug Administration (FDA), and the European Medicines Agency (EMA), aiming to propose a collaborative regulatory framework.
METHODS: We conducted an umbrella review together with a review of regulatory publications. We searched MEDLINE, CENTRAL, EMBASE, IEEE Xplore, and the ACM Digital Library for studies on trustworthiness and fairness in AI applied to healthcare, published through July 2024. Search keywords were “artificial intelligence (AI),” “machine learning,” “supervised learning,” “unsupervised learning,” “reinforcement learning,” “deep learning,” “trustworthy,” “fairness,” and “health.” Data were charted against five properties of trustworthy AI: fairness, explainability, transparency, safety, and robustness, and fairness-related findings from systematic and scoping reviews were synthesized. In parallel, regulatory publications from the MFDS, FDA, and EMA were retrieved using the keywords “artificial intelligence,” “trustworthy,” and “medical” and analyzed descriptively and thematically.
RESULTS: Of 933 records identified, 13 met the inclusion criteria. The umbrella review defined fairness in healthcare AI across four dimensions: systemic, outcome, treatment, and process fairness. Ten regulatory publications emphasized fairness, explainability, transparency, and ethics, with fairness and explainability consistently highlighted across all three agencies. AI fairness can be assessed by addressing data-related, algorithmic, and clinical biases. Key measurement targets include minority bias, informative bias, training-serving bias, label bias, cohort bias, automation bias, feedback loops, rejection bias, allocation discrepancies, privilege bias, informed mistrust, and agency bias, measured with calibration-based, score-based, confusion matrix-based, and parity-based metrics.
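
As a minimal illustration of the four metric families named above, the following Python sketch computes one example metric from each family on synthetic data; the group labels, risk scores, 0.5 decision threshold, and all variable names are illustrative assumptions rather than elements of the study.

import numpy as np

rng = np.random.default_rng(0)
n = 1000
group = rng.integers(0, 2, n)   # protected attribute (0/1); synthetic
y_true = rng.integers(0, 2, n)  # observed outcome; synthetic
score = np.clip(0.5 * y_true + rng.normal(0.3, 0.2, n), 0, 1)  # model risk score
y_pred = (score >= 0.5).astype(int)  # assumed 0.5 decision threshold

def tpr_fpr(y_t, y_p):
    # True- and false-positive rates from the group's confusion matrix.
    return y_p[y_t == 1].mean(), y_p[y_t == 0].mean()

# Parity-based: demographic parity difference (positive-prediction rate gap).
dp_gap = abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

# Score-based: gap in mean risk scores between groups, before thresholding.
score_gap = abs(score[group == 0].mean() - score[group == 1].mean())

# Confusion matrix-based: equalized-odds gaps in TPR and FPR across groups.
tpr0, fpr0 = tpr_fpr(y_true[group == 0], y_pred[group == 0])
tpr1, fpr1 = tpr_fpr(y_true[group == 1], y_pred[group == 1])

# Calibration-based: per-group gap between mean predicted and observed rates.
cal = [abs(score[group == g].mean() - y_true[group == g].mean()) for g in (0, 1)]

print(f"demographic parity gap: {dp_gap:.3f}")
print(f"mean score gap:         {score_gap:.3f}")
print(f"equalized-odds gaps:    TPR {abs(tpr0 - tpr1):.3f}, FPR {abs(fpr0 - fpr1):.3f}")
print(f"calibration gaps:       group 0 {cal[0]:.3f}, group 1 {cal[1]:.3f}")

In practice, a fairness audit would report such gaps for each protected attribute, with uncertainty estimates; the synthetic generator above simply keeps the sketch self-contained.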
CONCLUSIONS: Our findings reveal that fairness in healthcare AI encompasses four key dimensions, with regulatory bodies consistently emphasizing fairness and explainability. These insights provide a foundation for developing transparent, ethical, and equitable AI applications in healthcare, and underscore the need for a collaborative regulatory framework to address the challenges in AI fairness.

Conference/Value in Health Info

2025-09, ISPOR Real-World Evidence Summit 2025, Tokyo, Japan

Value in Health Regional, Volume 49S (September 2025)

Code

RWD166

Topic Subcategory

Data Protection, Integrity, & Quality Assurance

Disease

STA: Personalized & Precision Medicine
