Recent Advances in Computational Methods for Identifying Anticancer Peptides

Author(s): Pengmian Feng*, Zhenyi Wang.

Journal Name: Current Drug Targets

Volume 20, Issue 5, 2019

Abstract:

Anticancer peptides (ACPs) are small peptides that can kill cancer cells without damaging normal cells. In recent years, ACPs have been used pre-clinically for cancer treatment, so accurate identification of ACPs will promote their clinical application. As an alternative to labor-intensive experimental techniques, a series of computational methods have been proposed for identifying ACPs. In this review, we briefly summarize the current progress in the computational identification of ACPs. The challenges and future perspectives in developing reliable methods for identifying ACPs are also discussed. We anticipate that this review will provide novel insights for future research on anticancer peptides.

Keywords: Anticancer peptides, disease, cancer, drug target, machine learning methods, sequence encoding scheme.
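
As a rough illustration of what a "sequence encoding scheme" can look like, the sketch below computes the amino acid composition of a peptide, i.e. the fraction of each of the 20 standard residues. This is only one simple, widely used encoding and is an assumption here: the abstract does not say which encodings the review covers, and the function name and example sequence are hypothetical. A fixed-length vector like this could then be passed to a machine learning classifier.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every peptide
# maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> list[float]:
    """Return the 20-dimensional amino acid composition of a peptide.

    Each entry is the fraction of residues equal to the corresponding
    letter in AMINO_ACIDS. Non-standard characters are ignored.
    """
    seq = peptide.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical example sequence, not taken from the review.
features = amino_acid_composition("GLFDIVKKVV")
```

Because the vector length does not depend on peptide length, peptides of different sizes can be compared directly, at the cost of discarding residue order.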


Article Details

Year: 2019
Page: [481 - 487]
Pages: 7
DOI: 10.2174/1389450119666180801121548
