Model with the GBDT for Colorectal Adenoma Risk Diagnosis

(E-pub Ahead of Print)

Author(s): Junbo Gao*, Lifeng Zhang, Gaiqing Yu, Guoqiang Qu, Yanfeng Li, Xuebing Yang

Journal Name: Current Bioinformatics


Background and objective: Colorectal cancer (CRC) is a common malignant tumor of the digestive system and is associated with high morbidity and mortality. Colorectal adenoma (CRA) is a precancerous condition in most CRC patients, so predicting it early provides an opportunity to plan prevention, early diagnosis, and treatment. We aimed to build a machine learning model that predicts CRA and thereby helps physicians identify high-risk patients, make informed choices, and prevent CRC.

Methods: Patients who had undergone a colonoscopy were asked to fill out a questionnaire at the Sixth People Hospital of Shanghai, China, from July 2018 to November 2018. A gradient boosting decision tree (GBDT) classification model was developed to predict CRA and was compared with three other models: random forest (RF), support vector machine (SVM), and logistic regression (LR). The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance.
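As an illustration of the evaluation metric only, the AUC can be read as the probability that a randomly chosen CRA patient receives a higher predicted score than a randomly chosen non-CRA patient, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auc_score` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive, e.g. CRA).
    scores: iterable of predicted scores, higher = more likely positive.
    Tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three positives and three negatives.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8, 0.7]
print(round(auc_score(labels, scores), 4))
```

In this toy case, 6 of the 9 positive-negative pairs are ordered correctly, so the AUC is 6/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.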

Results: Among the 245 included patients, 65 had CRA. With 10-fold cross-validation, the AUCs of GBDT, RF, SVM, and LR were 0.8131, 0.74, 0.769, and 0.763, respectively. We also built an online prediction service, the CRA Inference System, to put the proposed solution into practice for patients with CRA.
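To make the 10-fold setup concrete, the following sketch splits a cohort with the reported class counts (245 patients, 65 with CRA) into ten folds. It assumes stratified splitting, which the abstract does not state; the function name `stratified_folds`, the fixed seed, and the round-robin assignment are likewise assumptions made for illustration.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, spreading each class evenly.

    Within each class the indices are shuffled and then dealt out
    round-robin, so per-fold class counts differ by at most one.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


# Class counts from the abstract: 65 CRA (1) and 180 non-CRA (0).
labels = [1] * 65 + [0] * 180
folds = stratified_folds(labels)

for n, fold in enumerate(folds):
    positives = sum(labels[i] for i in fold)
    print(f"fold {n}: size={len(fold)}, CRA cases={positives}")
```

Under these assumptions each held-out fold contains 24 or 25 patients, of whom only 6 or 7 have CRA, so a per-fold AUC rests on a small number of positive cases.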

Conclusion: We developed and compared four classification models for CRA prediction, and the GBDT model achieved the highest AUC. Using a GBDT model for screening can save time and money and help physicians identify high-risk groups for primary prevention.

Keywords: Colorectal adenoma, Colorectal cancer, Gradient boosting decision tree, Prediction, Clinical data, Early prevention

Article Details

DOI: 10.2174/1574893614666191120142005