OPTIMIZATION OF PROSTATE CANCER BIOPSY DECISION-MAKING IN KOREAN PATIENTS USING MACHINE LEARNING
Author(s)
Ji-Eun An, Sr., MS1, Su-Yeon Yu, PhD1, Min-Ji Rho, Ph.D. program1, Sanghwa Lee, Ph.D. program1, Jae-heung Jung, MD2
1Kangwon National University, Chuncheon, Korea, Republic of, 2Wonju Severance Christian Hospital, Wonju, Korea, Republic of
OBJECTIVES: Prostate cancer incidence in Korea has tripled over the past decade, and the disease is now the second most common malignancy in men aged ≥65 years. Prostate-specific antigen (PSA) screening facilitates early detection but has low specificity, leading to unnecessary biopsies and associated complications. This study aims to identify prostate cancer biomarkers and to develop predictive models for cancer diagnosis and biopsy-related complications in Korean patients.
METHODS: This retrospective cohort study analyzed data from 17,530 Korean male patients who underwent prostate biopsy between 2011 and 2019 at six tertiary hospitals. Variables included sociodemographic factors (age, smoking, family history), clinical factors (PSA, Gleason score, TNM stage, digital rectal examination [DRE], magnetic resonance imaging [MRI]), and outcomes (cancer diagnosis, biopsy complications). Significant predictors were identified using chi-square tests, t-tests, ANOVA, and LASSO regression. Predictive models (logistic regression, SVM, random forest, XGBoost, MLP, and a soft voting ensemble) were trained using an 80:20 split of the dataset with 10-fold cross-validation and hyperparameter tuning.
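As an illustration of the soft voting step named in the methods, the sketch below averages the positive-class probabilities produced by several models and thresholds the mean. It is a minimal sketch, not the authors' implementation: the function name `soft_vote`, the 0.5 threshold, the equal model weights, and the probability values are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def soft_vote(
    prob_sets: Sequence[Sequence[float]], threshold: float = 0.5
) -> Tuple[List[float], List[int]]:
    """Average per-model positive-class probabilities for each patient.

    prob_sets[m][i] is model m's predicted probability of cancer for
    patient i. Equal weights and the 0.5 threshold are assumptions.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    averaged = [
        sum(model[i] for model in prob_sets) / n_models
        for i in range(n_samples)
    ]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical outputs of three base models for three patients.
model_probs = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.4],
    [0.8, 0.3, 0.8],
]
avg_probs, predicted = soft_vote(model_probs)
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh several uncertain ones, which is the usual motivation for soft over hard voting.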
RESULTS: PSA, Gleason score, age, and family history were significant predictors of cancer. The soft voting ensemble achieved the best diagnostic performance (AUC 0.840, precision 0.863, F1-score 0.647). For patients with borderline PSA levels, performance declined but improved when additional features were included. Prediction of biopsy complications was limited by class imbalance, yet combining random oversampling with focal loss improved recall (0.476) and AUC (0.660). A web-based dashboard integrated SHAP analysis and ChatGPT for interpretability.
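The two imbalance techniques named in the results can be sketched as follows. This is a hedged illustration only: the helper names, the toy data, and the values `gamma=2.0` and `alpha=0.25` are assumptions and are not reported in the abstract. Random oversampling duplicates minority-class rows until the classes are the same size, and the binary focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), down-weights examples the model already classifies well.

```python
import math
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes match
    the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one example with predicted probability p
    of the positive class and true label y (0 or 1)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical toy data: one complication (label 1) among five biopsies.
rows = [[4.1], [5.0], [6.3], [7.2], [9.8]]
labels = [0, 0, 0, 0, 1]
bal_rows, bal_labels = random_oversample(rows, labels)
```

In practice, oversampling would be applied to the training portion only, so that duplicated rows do not leak into the held-out data used for evaluation.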
CONCLUSIONS: The models developed in this study can support prostate cancer diagnosis and biopsy decision-making in Korean patients. Future work should refine the predictive models to stratify biopsy complications by severity.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR176
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
SDC: Oncology, SDC: Urinary/Kidney Disorders, STA: Personalized & Precision Medicine