Development of Machine Learning Models for Predicting Prostate Cancer in Biopsy Candidates Using Prostate-Specific Antigen, Magnetic Resonance Imaging, and Hematologic Parameters

Posted on 30/03/2026
by Deniz Noyan Özlü

Prostate. 2026 Mar 30. doi: 10.1002/pros.70168. Online ahead of print.

ABSTRACT

INTRODUCTION: Prostate-specific antigen (PSA) alone is insufficient for the diagnosis of prostate cancer (PCa), particularly within the gray zone range of 4-10 ng/mL. Multiparametric magnetic resonance imaging (mpMRI), although widely used, has notable limitations in the diagnostic pathway. The aim of this study was to develop a biopsy prediction model by evaluating multiple machine learning (ML) algorithms incorporating PSA-related variables, mpMRI findings, and hematologic parameters.

MATERIALS AND METHODS: This study included patients with pre-biopsy PSA levels ≤ 10 ng/mL who underwent either systematic biopsy or combined biopsy (systematic plus fusion), depending on mpMRI findings, at our center between 2017 and 2024. Laboratory findings, mpMRI results, and prostate biopsy outcomes were recorded. Based on the pathological evaluation of biopsy specimens, the patients were divided into two groups: those without malignancy (Group 1) and those with malignancy (Group 2). Five ML techniques were applied to develop the prediction model: logistic regression, random forest (RF), extra trees, extreme gradient boosting (XGBoost) classifier, and light gradient-boosting machine classifier.
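The comparison of several classifiers on one held-out test set can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (generated to match only the cohort size and the roughly 15.5% malignancy rate), and the 80/20 stratified split, the hyperparameters, and the helper name `evaluate` are assumptions, since the abstract does not report them. XGBoost and LightGBM are included only if those packages are installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cohort (assumption): 1448 rows, ~15.5% positives.
# These are NOT the study's PSA, mpMRI, or hematologic variables.
X, y = make_classification(
    n_samples=1448,
    n_features=10,
    n_informative=5,
    weights=[0.845, 0.155],
    random_state=0,
)

# Stratify so the minority (malignant) class keeps its share in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=200, random_state=0),
}

# Optional third-party boosters; skipped when the package is missing.
try:
    from xgboost import XGBClassifier
    models["xgboost"] = XGBClassifier(random_state=0)
except ImportError:
    pass
try:
    from lightgbm import LGBMClassifier
    models["lightgbm"] = LGBMClassifier(random_state=0)
except ImportError:
    pass


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its test-set AUC, keyed by model name."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        proba = model.predict_proba(X_test)[:, 1]  # probability of class 1
        scores[name] = roc_auc_score(y_test, proba)
    return scores


auc_by_model = evaluate(models, X_train, y_train, X_test, y_test)
for name, auc in sorted(auc_by_model.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC = {auc:.3f}")
```

With a minority class of about 15%, a stratified split keeps the number of positive cases in the test set from drifting by chance, which matters because sensitivity is computed on those cases alone.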

RESULTS: There were 1223 patients (84.5%) in Group 1 and 225 patients (15.5%) in Group 2. XGBoost achieved the highest accuracy, with a sensitivity of 94.74% and a specificity of 100% on the test data set. For the RF model, sensitivity and specificity on the test data set were 78.95% and 100%, respectively. The area under the curve (AUC) values for XGBoost and RF were 0.97 and 0.89, respectively. According to the permutation feature importance analysis, the three most influential variables were the free/total PSA ratio, a Prostate Imaging-Reporting and Data System (PI-RADS) score of 4, and the platelet-to-lymphocyte ratio.
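For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and AUC are computed. The confusion-matrix counts and the scores are hypothetical: the abstract does not report the test-set size or the raw predictions, and the counts were chosen only because they happen to reproduce the reported percentages. The function names are illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of cancers correctly flagged
    specificity = tn / (tn + fp)  # share of benign cases correctly cleared
    return sensitivity, specificity


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts (assumption, not study data): 19 positives, 100 negatives.
sens_a, spec_a = sensitivity_specificity(tp=18, fn=1, tn=100, fp=0)
sens_b, spec_b = sensitivity_specificity(tp=15, fn=4, tn=100, fp=0)
print(f"{sens_a:.2%}, {spec_a:.2%}")  # 94.74%, 100.00%
print(f"{sens_b:.2%}, {spec_b:.2%}")  # 78.95%, 100.00%

# Toy scores (assumption): 5 of the 6 positive/negative pairs are ranked correctly.
toy_auc = auc_from_scores([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.3, 0.1])
print(f"{toy_auc:.3f}")  # 0.833
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both.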

CONCLUSION: The XGBoost and RF models demonstrated excellent performance, with high AUC values. Generalizing these results will require external validation of the ML models in different populations.

PMID:41911495 | DOI:10.1002/pros.70168