A MACHINE LEARNING APPROACH FOR EARLY DETECTION OF POLYCYSTIC OVARY SYNDROME (PCOS): ENHANCING DIAGNOSTIC EFFICIENCY USING EXTRA GRADIENT BOOST (XGBOOST)

Author(s)

Tanha T. Ahmed, BSc (Hons), Dania Al-Dulaimy, BSc (Hons)
WEP Clinical, London, United Kingdom
OBJECTIVES: Polycystic ovary syndrome (PCOS) is one of the most common endocrine disorders and is characterized by multifactorial symptoms and overlapping clinical features that complicate diagnosis. Standard diagnostic pathways are often based on the Rotterdam criteria, which include combinations of clinical hyperandrogenism, oligo-anovulation, and polycystic ovarian morphology, but these may not be reliable given the heterogeneity of PCOS. This study assesses the diagnostic performance of XGBoost models for PCOS classification in terms of accuracy and potential time efficiency.
METHODS: Peer-reviewed literature published up to December 2025 was consolidated to identify studies evaluating XGBoost-based models against standard clinical diagnostic criteria for PCOS. Studies were included if they reported model evaluation outputs such as area under the curve (AUC), accuracy, precision, and feature importance.
RESULTS: In one study, XGBoost consistently outperformed traditional machine learning (ML) classifiers. When clinical and ultrasound data were integrated, the model achieved an AUC of 0.9852 and an accuracy of 0.9384. Adding anti-Müllerian hormone (AMH) improved performance further (AUC = 0.9947; accuracy = 0.9553). Key diagnostic contributors identified through feature importance analysis included follicle counts, weight gain, AMH levels, hair growth, and menstrual irregularity. External validation on an independent dataset yielded perfect metrics (AUC = 1.0; precision = 1.0), although the validation dataset was small. A hybrid model combining XGBoost and VGGNet-19 achieved 99.6% accuracy in PCOS diagnosis on a dataset of 2,004 images. Additionally, a study of 1,600 women at a Chinese hospital reported excellent predictive accuracy for XGBoost in the early detection of PCOS (AUC = 0.919).
CONCLUSIONS: XGBoost-based PCOS diagnostic models show exceptional performance, suggesting potential to streamline diagnosis. However, larger multicenter validation and integration into clinical settings are needed before routine use. Future guidelines should consider incorporating validated ML tools to improve early detection and care.
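For readers less familiar with the metrics reported above, the following is a minimal sketch of how AUC, accuracy, and precision can be computed from a classifier's predicted probabilities. It is illustrative only and is not the method used in any of the cited studies. The function names (`roc_auc`, `accuracy_and_precision`), the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here; the data are made up and do not come from any PCOS dataset.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_precision(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and precision after thresholding scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    accuracy = correct / len(labels)
    # Precision is undefined when no case is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, precision


# Hypothetical toy data: 1 = PCOS, 0 = no PCOS; scores are made-up model outputs.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.92, 0.81, 0.40, 0.35, 0.10, 0.55, 0.20, 0.77]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("accuracy, precision:", accuracy_and_precision(toy_labels, toy_scores))
```

On this toy data, 15 of the 16 positive-negative pairs are ranked correctly, so the AUC is 0.9375, while accuracy and precision at the 0.5 threshold are both 0.75. With only a handful of cases, a perfectly separated set (for example, one positive scored 0.9 and one negative scored 0.1) gives an AUC of 1.0, which illustrates why perfect metrics on a small validation dataset warrant caution.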

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

MT5

Topic

Medical Technologies

Disease

SDC: Diabetes/Endocrine/Metabolic Disorders (including obesity), SDC: Reproductive & Sexual Health
