NeuroCS: A Tool to Predict Cleavage Sites of Neuropeptide Precursors

Author(s): Ying Wang, Juanjuan Kang, Ning Li, Yuwei Zhou, Zhongjie Tang, Bifang He, Jian Huang*.

Journal Name: Protein & Peptide Letters

Volume 27, Issue 4, 2020


Abstract:

Background: Neuropeptides are a class of bioactive peptides produced from neuropeptide precursors through a series of highly complex processes, and they mediate many aspects of neuronal regulation. Accurate identification of the cleavage sites of neuropeptide precursors is of great significance for neuroscience and brain science.

Objective: With the explosive growth of neuropeptide precursor data, there is a pressing need for bioinformatics methods that predict the cleavage sites of neuropeptide precursors quickly and efficiently.

Methods: We first processed the neuropeptide precursor data from SwissProt and NeuroPedia into two datasets, a training dataset and a testing dataset. Six feature extraction schemes were then applied to generate different feature sets, and feature selection methods were used to find the optimal feature subset of each. Next, the support vector machine was used to build a model for each feature type. Finally, the performance of the models was evaluated with the independent testing dataset.
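As an illustration of one of the feature extraction schemes, the sketch below encodes a peptide fragment with enhanced amino acid composition (EAAC) as it is commonly defined: a fixed-size window slides along the fragment and the frequency of each of the 20 standard amino acids is computed inside every window. The window size of 5, the example fragment, and the function name `eaac` are assumptions made for this example and are not taken from the paper.

```python
# Minimal EAAC sketch. Window size, fragment, and names are illustrative
# assumptions, not the settings used by NeuroCS.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def eaac(fragment, window=5):
    """Return EAAC features for a fixed-length peptide fragment.

    For every sliding window of `window` residues, the frequency of each
    standard amino acid in that window is appended to the feature vector.
    """
    if len(fragment) < window:
        raise ValueError("fragment is shorter than the sliding window")
    features = []
    for start in range(len(fragment) - window + 1):
        sub = fragment[start:start + window]
        for aa in AMINO_ACIDS:
            # Simplification: divide by the window size; non-standard
            # characters are simply never counted.
            features.append(sub.count(aa) / window)
    return features


# Hypothetical 8-residue fragment around a candidate cleavage site.
vector = eaac("SKRGAAKR", window=5)
print(len(vector))  # (8 - 5 + 1) windows * 20 residues
```

Each fragment yields `(len(fragment) - window + 1) * 20` values, so fragments of equal length produce vectors of equal length that can be passed to a classifier such as a support vector machine.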

Results: Six models were built with the support vector machine. Among them, the enhanced amino acid composition-based model reached the highest accuracy, 91.60%, in 5-fold cross-validation. When evaluated with the independent testing dataset, it also performed well, with an accuracy of 90.37% and an area under the receiver operating characteristic curve of 0.9576.
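For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and the area under the receiver operating characteristic curve (AUC) can be computed from true labels and classifier scores. The labels, scores, and 0.5 decision threshold are made-up illustration data, not results from the paper, and the function names are hypothetical.

```python
# Minimal metric sketch with toy data; not the paper's evaluation code.
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (rank-based definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = cleavage site, 0 = non-cleavage site.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(accuracy(labels, predicted))
print(roc_auc(labels, scores))
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.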

Conclusion: The developed model performed well in both cross-validation and independent testing. For users' convenience, an online web server called NeuroCS was built, which is freely available at http://i.uestc.edu.cn/NeuroCS/dist/index.html#/. NeuroCS can be used to predict the cleavage sites of neuropeptide precursors effectively.

Keywords: Cleavage sites, enhanced amino acid composition, machine learning, neuropeptide, neuropeptide precursor, support vector machine.





Article Details

VOLUME: 27
ISSUE: 4
Year: 2020
Page: [337 - 345]
Pages: 9
DOI: 10.2174/0929866526666191112150636
