Identification of Anti-cancer Peptides Based on Multi-classifier System

Author(s): Wanben Zhong, Bineng Zhong*, Hongbo Zhang, Ziyi Chen, Yan Chen

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research

Volume 22, Issue 10, 2019



Aims and Objective: Cancer is one of the deadliest diseases, taking the lives of millions every year. Traditional methods of treating cancer are expensive and toxic to normal cells. Fortunately, anti-cancer peptides (ACPs) can eliminate this side effect. However, identifying and developing new anti-cancer peptides through experiments takes a great deal of time and money, so a fast and accurate computational model for identifying anti-cancer peptides is needed. Machine learning algorithms are well suited to this task.

Materials and Methods: In our study, a multi-classifier system that combines multiple machine learning models was used to predict anti-cancer peptides. The individual learners are built from different feature information and algorithms, and their predictions are combined by voting to form the multi-classifier system.
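The voting step described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not say which features or algorithms the individual learners use, so the amino acid composition feature, the three rule-based learners, and all thresholds below are hypothetical stand-ins for trained models.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A learner maps a peptide sequence to a label: 1 = ACP, 0 = non-ACP.
Learner = Callable[[str], int]


def amino_acid_composition(peptide: str) -> Dict[str, float]:
    """Fraction of each of the 20 standard residues in the peptide."""
    counts = Counter(peptide.upper())
    length = max(len(peptide), 1)  # guard against empty input
    return {aa: counts[aa] / length for aa in AMINO_ACIDS}


# Hypothetical individual learners. In a real system each would be a
# trained model (different feature set + different algorithm).
def cationic_rule(peptide: str) -> int:
    comp = amino_acid_composition(peptide)
    return int(comp["K"] + comp["R"] > 0.2)  # assumed threshold


def hydrophobic_rule(peptide: str) -> int:
    comp = amino_acid_composition(peptide)
    return int(sum(comp[aa] for aa in "AILMFVW") > 0.3)  # assumed threshold


def length_rule(peptide: str) -> int:
    return int(len(peptide) <= 50)  # assumed threshold


def majority_vote(labels: Sequence[int]) -> int:
    """Return the most frequent label among the learners' predictions."""
    return Counter(labels).most_common(1)[0][0]


def predict_ensemble(peptide: str, learners: Sequence[Learner]) -> int:
    """Collect one vote per learner and return the majority label."""
    return majority_vote([learner(peptide) for learner in learners])


LEARNERS = [cationic_rule, hydrophobic_rule, length_rule]
```

Using an odd number of learners, as here, avoids tied votes in the two-class case.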

Results and Conclusion: The experiments show that the overall prediction rate of each individual learner is above 80% and that the overall accuracy of the multi-classifier system for anti-cancer peptide prediction can reach 95.93%, which is better than that of the existing prediction model.

Keywords: Anti-cancer peptides, machine learning, individual learner, feature extraction, multi-classifier system, prediction model.



Article Details

Year: 2019
Page: [694 - 704]
Pages: 11
DOI: 10.2174/1386207322666191203141102
