Background: Cancerlectins play an important role in the metastasis of various cancers and in tumor
cell differentiation. Therefore, a comprehensive understanding of cancerlectin functions could point to
future directions for cancer treatment. Although various computational methods have been proposed as
auxiliary tools for distinguishing cancerlectin protein sequences, these methods sometimes fail because
of the large sequence diversity among cancerlectins.
Objective: The objective of this study is to provide an efficient predictor for identifying cancerlectins.
Method: Herein, we build a prediction model based on a support vector machine, which improves the
sensitivity and accuracy of cancerlectin protein identification. Feature extraction and selection are performed
by our proposed Split Bi-Profile Bayes (SBPB) scheme and a lasso algorithm, respectively.
Results: In jackknife cross-validation, our model (called iCanLec-SBPB) achieved a sensitivity of
81.36% and an accuracy of 83.25%.
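
For readers unfamiliar with the evaluation protocol, the following is a minimal sketch of jackknife (leave-one-out) cross-validation and of how sensitivity and accuracy are computed from its predictions. It is illustrative only and is not the iCanLec-SBPB implementation: the nearest-centroid classifier is a stand-in for the support vector machine, the two-dimensional feature vectors are made-up toy values rather than SBPB features, and the names `train_nearest_centroid` and `jackknife_evaluate` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector], int]
Trainer = Callable[[List[Vector], List[int]], Predictor]


def train_nearest_centroid(X: List[Vector], y: List[int]) -> Predictor:
    """Stand-in classifier (NOT the SVM from the paper): assign a sample
    to the class whose mean feature vector is closest."""

    def centroid(label: int) -> List[float]:
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_pos, c_neg = centroid(1), centroid(0)

    def predict(x: Vector) -> int:
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        return 1 if d_pos <= d_neg else 0

    return predict


def jackknife_evaluate(
    X: List[Vector], y: List[int], train: Trainer
) -> Tuple[float, float]:
    """Leave each sample out once, train on the rest, predict the held-out
    sample, and return (sensitivity, accuracy) over all held-out predictions."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        # Train on every sample except the i-th one.
        predict = train(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        pred = predict(X[i])
        if y[i] == 1:
            tp += pred == 1
            fn += pred == 0
        else:
            tn += pred == 0
            fp += pred == 1
    # Sensitivity = TP / (TP + FN); accuracy = (TP + TN) / total.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(X)
    return sensitivity, accuracy


# Toy data (assumed values): label 1 = cancerlectin, label 0 = non-cancerlectin.
X_toy = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.2, 0.1], [0.1, 0.2], [0.3, 0.3]]
y_toy = [1, 1, 1, 0, 0, 0]
sens, acc = jackknife_evaluate(X_toy, y_toy, train_nearest_centroid)
```

Because `jackknife_evaluate` takes the training routine as an argument, any classifier that returns a prediction function can be substituted for the stand-in without changing the evaluation loop.
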
Conclusion: The results confirm that iCanLec-SBPB achieves higher sensitivity and accuracy than other existing