pLoc_bal-mVirus: Predict Subcellular Localization of Multi-Label Virus Proteins by Chou's General PseAAC and IHTS Treatment to Balance Training Dataset

Author(s): Xuan Xiao*, Xiang Cheng, Genqiang Chen, Qi Mao, Kuo-Chen Chou.

Journal Name: Medicinal Chemistry

Volume 15 , Issue 5 , 2019

Abstract:

Background/Objective: Knowledge of protein subcellular localization is vitally important for both basic research and drug development. Facing the avalanche of protein sequences emerging in the post-genomic age, it is urgent to develop computational tools that can identify their subcellular localization in a timely and effective manner based on sequence information alone. Recently, a predictor called “pLoc-mVirus” was developed for identifying the subcellular localization of virus proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, known as “multiplex proteins”, may simultaneously occur in, or move between, two or more subcellular location sites. Although it is a very powerful predictor, further effort is needed to improve it, because pLoc-mVirus was trained on an extremely skewed dataset in which some subsets were more than 10 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset.

Methods: Using Chou's general PseAAC (Pseudo Amino Acid Composition) approach and the IHTS (Inserting Hypothetical Training Samples) treatment to balance the training dataset, we have developed a new predictor called “pLoc_bal-mVirus” for predicting the subcellular localization of multi-label virus proteins.
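The idea of balancing a skewed training set by inserting hypothetical samples can be illustrated with a minimal sketch. The abstract does not describe how IHTS generates its samples, so the code below is not the published procedure: it assumes a generic interpolation-style oversampling in which each smaller subset is padded with points lying between two of its real feature vectors. The function name `balance_subsets`, the toy labels, and the two-dimensional feature vectors are all hypothetical, and every subset is assumed to be non-empty.

```python
import random


def balance_subsets(subsets, seed=0):
    """Pad every subset to the size of the largest one with synthetic points.

    subsets: dict mapping a location label to a non-empty list of feature
    vectors (lists of floats). Returns a new dict; the input is not modified.
    NOTE: generic interpolation oversampling, assumed for illustration only;
    it is not claimed to be the IHTS procedure used by the authors.
    """
    rng = random.Random(seed)
    target = max(len(vectors) for vectors in subsets.values())
    balanced = {}
    for label, vectors in subsets.items():
        padded = [list(v) for v in vectors]
        while len(padded) < target:
            # Pick two real samples from the same subset (they may coincide).
            a = rng.choice(vectors)
            b = rng.choice(vectors)
            t = rng.random()
            # Hypothetical sample on the line segment between a and b.
            padded.append([x + t * (y - x) for x, y in zip(a, b)])
        balanced[label] = padded
    return balanced


# Toy data: one subset is five times the size of the other.
toy = {
    "nucleus": [[0.1 * i, 1.0] for i in range(10)],
    "capsid": [[0.0, 0.0], [1.0, 2.0]],
}
balanced = balance_subsets(toy)
sizes = {label: len(vectors) for label, vectors in balanced.items()}
```

After the call, both toy subsets hold the same number of vectors, and the real samples are kept unchanged at the front of each list. In a genuine multi-label setting a protein may belong to several subsets at once, which this per-label sketch does not address.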

Results: Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mVirus, the existing state-of-the-art predictor for the same purpose.

Conclusion: Its user-friendly web-server is available at http://www.jci-bioinfo.cn/pLoc_balmVirus/, by which the majority of experimental scientists can easily get their desired results without needing to work through the detailed mathematics. Accordingly, pLoc_bal-mVirus will become a very useful tool for designing multi-target drugs and for gaining an in-depth understanding of biological processes in a cell.

Keywords: Multi-label system, virus proteins, multi-target drugs, Chou's 5-step rules, Chou's general PseAAC, ML-GKR, Chou's intuitive metrics.

[1]
Cheng, X.; Xiao, X. pLoc-mVirus: Predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC. Gene(Erratum: ibid., 2018, Vol. 644, 156-156), 2017, 628, 315-321.
[2]
Cedano, J.; Aloy, P.; Perez-Pons, J.A.; Querol, E. Relation between amino acid composition and cellular location of proteins. J. Mol. Biol., 1997, 266, 594-600.
[3]
Chou, K.C.; Elrod, D.W. Using discriminant function for prediction of subcellular location of prokaryotic proteins. Biochem. Biophys. Res. Commun., 1998, 252, 63-68.
[4]
Reinhardt, A.; Hubbard, T. Using neural networks for prediction of the subcellular location of proteins. Nucleic Acids Res., 1998, 26, 2230-2236.
[5]
Chou, K.C.; Elrod, D.W. Protein subcellular location prediction. Protein Eng., 1999, 12, 107-118.
[6]
Chou, K.C.; Elrod, D.W. Prediction of membrane protein types and subcellular locations. Proteins Struct. Funct. Genet., 1999, 34, 137-153.
[7]
Emanuelsson, O.; Nielsen, H.; Brunak, S.; von Heijne, G. Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol., 2000, 300, 1005-1016.
[8]
Chou, K.C. Prediction of protein subcellular locations by incorporating quasi-sequence-order effect. Biochem. Biophys. Res. Commun., 2000, 278, 477-483.
[9]
Chou, K.C. Prediction of protein cellular attributes using pseudo amino acid composition. Proteins. Struct. Funct. Genet.(Erratum: ibid., 2001, Vol.44, 60), 2001, 43, 246-255.
[10]
Cai, Y.D.; Liu, X.J.; Xu, X.B. Support vector machines for prediction of protein subcellular location by incorporating quasi-sequence-order effect. J. Cell. Biochem., 2002, 84, 343-348.
[11]
Chou, K.C.; Cai, Y.D. Using functional domain composition and support vector machines for prediction of protein subcellular location. J. Biol. Chem., 2002, 277, 45765-45769.
[12]
Park, K.J.; Kanehisa, M. Prediction of protein subcellular locations by support vector machines using compositions of amino acid and amino acid pairs. Bioinformatics, 2003, 19, 1656-1663.
[13]
Chou, K.C.; Cai, Y.D. Prediction and classification of protein subcellular location: Sequence-order effect and pseudo amino acid composition. J. Cell. Biochem.(Addendum, ibid. 2004, 91, 1085), 2003, 90, 1250-1260.
[14]
Gardy, J.L.; Spencer, C.; Wang, K.; Ester, M.; Tusnady, G.E.; Simon, I.; Hua, S.; deFays, K.; Lambert, C.; Nakai, K.; Brinkman, F.S. PSORT-B: Improving protein subcellular localization prediction for Gram-negative bacteria. Nucleic Acids Res., 2003, 31, 3613-3617.
[15]
Cai, Y.D. A new hybrid approach to predict subcellular localization of proteins by incorporating gene ontology. Biochem. Biophys. Res. Commun., 2003, 311, 743-747.
[16]
Matsuda, S.; Vert, J.P.; Saigo, H.; Ueda, N.; Toh, H.; Akutsu, T. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci., 2005, 14, 2804-2813.
[17]
Chou, K.C.; Shen, H.B. Predicting protein subcellular location by fusing multiple classifiers. J. Cell. Biochem., 2006, 99, 517-527.
[18]
Shen, H.B. Gpos-PLoc: An ensemble classifier for predicting subcellular localization of Gram-positive bacterial proteins. Protein Eng. Des. Sel., 2007, 20, 39-46.
[19]
Ding, Y.S.; Zhang, T.L. Using Chou’s pseudo amino acid composition to predict subcellular localization of apoptosis proteins: an approach with immune genetic algorithm-based ensemble classifier. Pattern Recognit. Lett., 2008, 29, 1887-1892.
[20]
Lin, J.; Wang, Y. Using a novel AdaBoost algorithm and Chou’s pseudo amino acid composition for predicting protein subcellular localization. Protein Pept. Lett., 2011, 18, 1219-1225.
[21]
Hu, L.; Huang, T.; Shi, X.; Lu, W.C.; Cai, Y.D. Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties. PLoS One, 2011, 6e14556
[22]
Fan, G.L.; Li, Q.Z. Predict mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition. J. Theor. Biol., 2012, 304, 88-95.
[23]
Dehzangi, A.; Heffernan, R.; Sharma, A.; Lyons, J.; Paliwal, K.; Sattar, A. Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou’s general PseAAC. J. Theor. Biol., 2015, 364, 284-294.
[24]
Sharma, R.; Dehzangi, A.; Lyons, J.; Paliwal, K.; Tsunoda, T.; Sharma, A. Predict Gram-positive and Gram-negative subcellular localization via incorporating evolutionary information and physicochemical features into Chou’s general PseAAC. IEEE Trans. Nanobioscience, 2015, 14, 915-926.
[25]
Nakai, K. Protein sorting signals and prediction of subcellular localization. Adv. Protein Chem., 2000, 54, 277-344.
[26]
Chou, K.C.; Shen, H.B. Recent progresses in protein subcellular location prediction. Anal. Biochem., 2007, 370, 1-16.
[27]
Chou, K.C.; Shen, H.B. Euk-mPLoc: A fusion classifier for large-scale eukaryotic protein subcellular location prediction by incorporating multiple sites. J. Proteome Res., 2007, 6, 1728-1734.
[28]
Shen, H.B. Hum-mPLoc: An ensemble classifier for large-scale human protein subcellular location prediction by incorporating samples with multiple sites. Biochem. Biophys. Res. Commun., 2007, 355, 1006-1011.
[29]
Chou, K.C.; Shen, H.B. Cell-PLoc: A package of Web servers for predicting subcellular localization of proteins in various organisms. Nat. Protoc., 2008, 3, 153-162.
[30]
Shen, H.B. Virus-mPLoc: A fusion classifier for viral protein subcellular location prediction by incorporating multiple sites. J. Biomol. Struct. Dyn., 2010, 28, 175-186.
[31]
Mei, S. Predicting plant protein subcellular multi-localization by Chou’s PseAAC formulation based multi-label homolog knowledge transfer learning. J. Theor. Biol., 2012, 310, 80-87.
[32]
Pacharawongsakda, E.; Theeramunkong, T. Predict subcellular locations of singleplex and multiplex proteins by semi-supervised learning and dimension-reducing general mode of Chou’s PseAAC. IEEE Trans. Nanobioscience, 2013, 12, 311-320.
[33]
Wang, X.; Li, G.Z.; Lu, W.C. Virus-ECC-mPLoc: A multi-label predictor for predicting the subcellular localization of virus proteins with both single and multiple sites based on a general form of Chou’s pseudo amino acid composition. Protein Pept. Lett., 2013, 20, 309-317.
[34]
Wang, X.; Zhang, W.; Zhang, Q.; Li, G.Z. MultiP-SChlo: Multi-label protein subchloroplast localization prediction with Chou’s pseudo amino acid composition and a novel multi-label classifier. Bioinformatics, 2015, 31, 2639-2645.
[35]
Glory, E.; Murphy, R.F. Automated subcellular location determination and high-throughput microscopy. Dev. Cell, 2007, 12, 7-16.
[36]
Wang, S.Q.; Cheng, X.C.; Dong, W.L.; Wang, R.L. Three new powerful Oseltamivir derivatives for inhibiting the neuraminidase of influenza virus. Biochem. Biophys. Res. Commun., 2010, 401, 188-191.
[37]
Liu, L.; Ma, Y.; Wang, R.L.; Xu, W.R.; Wang, S.Q. Find novel dual-agonist drugs for treating type 2 diabetes by means of cheminformatics. Drug Des. Devel. Ther., 2013, 7, 279-287.
[38]
Ma, Y.; Wang, S.Q.; Xu, W.R.; Wang, R.L. Design novel dual agonists for treating type-2 diabetes by targeting peroxisome proliferator-activated receptors with core hopping approach. PLoS One, 2012, 7e38546
[39]
Chou, K.C. Some remarks on predicting multi-label attributes in molecular biosystems. Mol. Biosyst., 2013, 9, 1092-1100.
[40]
Xiao, X.; Wu, Z.C. iLoc-Virus: A multi-label learning classifier for identifying the subcellular localization of virus proteins with both single and multiple sites. J. Theor. Biol., 2011, 284, 42-51.
[41]
Liu, Z.; Xiao, X.; Qiu, W.R. iDNA-Methyl: Identifying DNA methylation sites via pseudo trinucleotide composition. Anal. Biochem., 2015, 474, 69-77.
[42]
Xiao, X.; Min, J.L.; Lin, W.Z.; Liu, Z.; Cheng, X. iDrug-Target: Predicting the interactions between drug compounds and target proteins in cellular networking via the benchmark dataset optimization approach. J. Biomol. Struct. Dyn., 2015, 33, 2221-2233.
[43]
Jia, J.; Liu, Z.; Xiao, X.; Liu, B. iSuc-PseOpt: Identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset. Anal. Biochem., 2016, 497, 48-56.
[44]
Jia, J.; Liu, Z.; Xiao, X.; Liu, B. iPPBS-Opt: A sequence-based ensemble classifier for identifying protein-protein binding sites by optimizing imbalanced training datasets. Molecules, 2016, 21E95
[45]
Liu, B.; Yang, F.; Huang, D.S. iPromoter-2L: A two-layer predictor for identifying promoters and their types by multi-window-based PseKNC. Bioinformatics, 2018, 34, 33-40.
[46]
Chen, W.; Tang, H.; Ye, J.; Lin, H. iRNA-PseU: Identifying RNA pseudouridine sites. Mol. Ther. Nucleic Acids, 2016, 5e332
[47]
Cheng, X.; Xiao, X. pLoc-mPlant: Predict subcellular localization of multi-location plant proteins via incorporating the optimal GO information into general PseAAC. Mol. Biosyst., 2017, 13, 1722-1727.
[48]
Feng, P.; Ding, H.; Yang, H.; Chen, W.; Lin, H. iRNA-PseColl: Identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC. Mol. Ther. Nucleic Acids, 2017, 7, 155-163.
[49]
Cheng, X.; Zhao, S.G.; Lin, W.Z.; Xiao, X. pLoc-mAnimal: Predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics, 2017, 33, 3524-3531.
[50]
Liu, B.; Yang, F. 2L-piRNA: A two-layer ensemble classifier for identifying piwi-interacting RNAs and their function. Mol. Ther. Nucleic Acids, 2017, 7, 267-277.
[51]
Cheng, X.; Zhao, S.G.; Xiao, X. iATC-mISF: A multi-label classifier for predicting the classes of anatomical therapeutic chemicals. Bioinformatics(Corrigendum, ibid., 2017, Vol.33, 2610), 2017, 33, 341-346.
[52]
Liu, B.; Wang, S.; Long, R. iRSpot-EL: Identify recombination spots with an ensemble learning approach. Bioinformatics, 2017, 33, 35-41.
[53]
Xiao, X.; Cheng, X.; Su, S.; Nao, Q. pLoc-mGpos: Incorporate key gene ontology information into general PseAAC for predicting subcellular localization of Gram-positive bacterial proteins. Nat. Sci., 2017, 9, 331-349.
[54]
Qiu, W.R.; Jiang, S.Y.; Xu, Z.C.; Xiao, X. iRNAm5C-PseDNC: Identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition. Oncotarget, 2017, 8, 41178-41188.
[55]
Qiu, W.R.; Sun, B.Q.; Xiao, X.; Xu, Z.C.; Jia, J.H. iKcr-PseEns: Identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics, 2018, 110, 239-246.
[56]
Cheng, X.; Xiao, X. pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics, 2018, 110, 50-58.
[57]
Li, F.; Li, C.; Marquez-Lago, T.T.; Leier, A.; Akutsu, T.; Purcell, A.W.; Smith, A.I.; Lightow, T.; Daly, R.J.; Song, J. Quokka: A comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome. Bioinformatics, 2018, 34(24), 4223-4231.
[58]
Song, J.; Li, F.; Takemoto, K.; Haffari, G.; Akutsu, T.; Webb, G.I. PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural and network features in a machine learning framework. J. Theor. Biol., 2018, 443, 125-137.
[59]
Song, J.; Wang, Y.; Li, F.; Akutsu, T.; Rawlings, N.D.; Webb, G.I. iProt-Sub: A comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief. Bioinform., 2018. [Epub ahead of print].
[http://dx.doi.org/10.1093/bib/bby028]
[60]
Cheng, X.; Xiao, X. pLoc-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC. Genomics, 2018, 110, 231-239.
[61]
Chen, W.; Feng, P.; Yang, H.; Ding, H.; Lin, H. iRNA-3typeA: Identifying 3-types of modification at RNA’s adenosine sites. Mol. Ther. Nucleic Acids, 2018, 11, 468-474.
[62]
Feng, P.; Yang, H.; Ding, H.; Lin, H.; Chen, W. iDNA6mA-PseKNC: Identifying DNA N6-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC. Genomics, 2019, 111, 96-102.
[63]
Liu, B.; Weng, F.; Huang, D.S. iRO-3wPseKNC: Identify DNA replication origins by three-window-based PseKNC. Bioinformatics, 2018, 34, 3086-3093.
[64]
Yang, H.; Qiu, W.R.; Liu, G.; Guo, F.B.; Chen, W.; Lin, H. iRSpot-Pse6NC: Identifying recombination spots in Saccharo-myces cerevisiae by incorporating hexamer composition into general PseKNC. Int. J. Biol. Sci., 2018, 14, 883-891.
[65]
Khan, Y.D.; Rasool, N.; Hussain, W.; Khan, S.A. iPhosT-PseAAC: Identify phosphothreonine sites by incorporating sequence statistical moments into PseAAC. Anal. Biochem., 2018, 550, 109-116.
[66]
Cheng, X.; Xiao, X. pLoc-mHum: Predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics, 2018, 34, 1448-1456.
[67]
Su, Z.D.; Huang, Y.; Zhang, Z.Y.; Zhao, Y.W.; Wang, D.; Chen, W.; Lin, H. iLoc-lncRNA: Predict the subcellular location of lncRNAs by incorporating octamer composition into general PseKNC. Bioinformatics, 2018, 34(24), 4196-4204.
[68]
Chen, W.; Ding, H.; Zhou, X.; Lin, H. iRNA(m6A)-PseDNC: Identifying N6-methyladenosine sites using pseudo dinucleotide composition. Anal. Biochem., 2018, 561-562, 59-65.
[69]
Jia, J.; Li, X.; Qiu, W.; Xiao, X. iPPI-PseAAC(CGR): Identify protein-protein interactions by incorporating chaos game representation into PseAAC. J. Theor. Biol., 2019, 460, 195-203.
[70]
Chou, K.C. Some remarks on protein attribute prediction and pseudo amino acid composition (50th Anniversary Year Review). J. Theor. Biol., 2011, 273, 236-247.
[71]
Zhang, C.T. An optimization approach to predicting protein structural class from amino acid composition. Protein Sci., 1992, 1, 401-408.
[72]
Chou, K.C.; Zhang, C.T. A correlation coefficient method to predicting protein structural classes from amino acid compositions. Eur. J. Biochem., 1992, 207, 429-433.
[73]
Chou, J.J. Predicting cleavability of peptide sequences by HIV protease via correlation-angle approach. J. Protein Chem., 1993, 12, 291-302.
[74]
Chou, J.J. A formulation for correlating properties of peptides and its application to predicting human immunodeficiency virus protease-cleavable sites in proteins. Biopolymers, 1993, 33, 1405-1414.
[75]
Chou, J.J.; Zhang, C.T. A joint prediction of the folding types of 1490 human proteins from their genetic codons. J. Theor. Biol., 1993, 161, 251-262.
[76]
Chou, K.C.; Elrod, D.W. Bioinformatical analysis of G-protein-coupled receptors. J. Proteome Res., 2002, 1, 429-433.
[77]
Chen, W.; Lin, H.; Feng, P.M.; Ding, C.; Zuo, Y.C. iNuc-PhysChem: A sequence-based predictor for identifying nucleosomes via physicochemical properties. PLoS One, 2012, 7e47843
[78]
Xu, Y.; Ding, J.; Wu, L.Y. iSNO-PseAAC: Predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition. PLoS One, 2013, 8e55844
[79]
Xiao, X.; Wang, P. iNR-PhysChem: A sequence-based predictor for identifying nuclear receptors and their subfamilies via physical-chemical property matrix. PLoS One, 2012, 7e30869
[80]
Cai, Y.D.; Feng, K.Y.; Lu, W.C. Using LogitBoost classifier to predict protein structural classes. J. Theor. Biol., 2006, 238, 172-176.
[81]
Feng, P.M.; Chen, W.; Lin, H. iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition. Anal. Biochem., 2013, 442, 118-125.
[82]
Chen, W.; Feng, P.M.; Lin, H. iRSpot-PseDNC: Identify recombination spots with pseudo dinucleotide composition. Nucleic Acids Res., 2013, 41e68
[83]
Cai, Y.D. Predicting subcellular localization of proteins in a hybridization space. Bioinformatics, 2004, 20, 1151-1156.
[84]
Cai, Y.D. Prediction of protease types in a hybridization space. Biochem. Biophys. Res. Commun., 2006, 339, 1015-1020.
[85]
Lin, W.Z.; Fang, J.A.; Xiao, X. iDNA-Prot: Identification of DNA binding proteins using random forest with grey model. PLoS One, 2011, 6e24756
[86]
Kandaswamy, K.K.; Martinetz, T.; Moller, S.; Suganthan, P.N.; Sridharan, S.; Pugalenthi, G. AFP-Pred: A random forest approach for predicting antifreeze proteins from sequence-derived properties. J. Theor. Biol., 2011, 270, 56-62.
[87]
Chou, K.C. Impacts of bioinformatics to medicinal chemistry. Med. Chem., 2015, 11, 218-234.
[88]
Chou, K.C. Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics, 2005, 21, 10-19.
[89]
Xiao, X.; Shao, S.; Ding, Y.; Huang, Z.; Chen, X. Using cellular automata to generate Image representation for biological sequences. Amino Acids, 2005, 28, 29-35.
[90]
Mundra, P.; Kumar, M.; Kumar, K.K.; Jayaraman, V.K.; Kulkarni, B.D. Using pseudo amino acid composition to predict protein subnuclear localization: Approached with PSSM. Pattern Recognit. Lett., 2007, 28, 1610-1615.
[91]
Zhou, X.B.; Chen, C.; Li, Z.C.; Zou, X.Y. Using Chou’s amphiphilic pseudo amino acid composition and support vector machine for prediction of enzyme subfamily classes. J. Theor. Biol., 2007, 248, 546-551.
[92]
Ding, Y.S.; Zhang, T.L. Prediction of protein structure classes with pseudo amino acid composition and fuzzy support vector machine network. Protein Pept. Lett., 2007, 14, 811-815.
[93]
Nanni, L.; Lumini, A. Genetic programming for creating Chou’s pseudo amino acid based features for submitochondria localization. Amino Acids, 2008, 34, 653-660.
[94]
Zhang, G.Y.; Fang, B.S. Predicting the cofactors of oxidoreductases based on amino acid composition distribution and Chou’s amphiphilic pseudo amino acid composition. J. Theor. Biol., 2008, 253, 310-315.
[95]
Jiang, X.; Wei, R.; Zhao, Y.; Zhang, T. Using Chou’s pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location. Amino Acids, 2008, 34, 669-675.
[96]
Georgiou, D.N.; Karakasidis, T.E.; Nieto, J.J.; Torres, A. Use of fuzzy clustering technique and matrices to classify amino acids and its impact to Chou’s pseudo amino acid composition. J. Theor. Biol., 2009, 257, 17-26.
[97]
Ding, H.; Luo, L.; Lin, H. Prediction of cell wall lytic enzymes using Chou’s amphiphilic pseudo amino acid composition. Protein Pept. Lett., 2009, 16, 351-355.
[98]
Zeng, Y.H.; Guo, Y.Z.; Xiao, R.Q.; Yang, L.; Yu, L.Z.; Li, M.L. Using the augmented Chou’s pseudo amino acid composition for predicting protein submitochondria locations based on auto covariance approach. J. Theor. Biol., 2009, 259, 366-372.
[99]
Qiu, J.D.; Huang, J.H.; Liang, R.P.; Lu, X.Q. Prediction of G-protein-coupled receptor classes based on the concept of Chou’s pseudo amino acid composition: An approach from discrete wavelet transform. Anal. Biochem., 2009, 390, 68-73.
[100]
Esmaeili, M.; Mohabatkar, H.; Mohsenzadeh, S. Using the concept of Chou’s pseudo amino acid composition for risk type prediction of human papillomaviruses. J. Theor. Biol., 2010, 263, 203-209.
[101]
Mohabatkar, H. Prediction of cyclin proteins using Chou’s pseudo amino acid composition. Protein Pept. Lett., 2010, 17, 1207-1214.
[102]
Gu, Q.; Ding, Y.S.; Zhang, T.L. Prediction of G-protein-coupled receptor classes in low homology using Chou’s pseudo amino acid composition with approximate entropy and hydrophobicity patterns. Protein Pept. Lett., 2010, 17, 559-567.
[103]
Sahu, S.S.; Panda, G. A novel feature representation method based on Chou’s pseudo amino acid composition for protein structural class prediction. Comput. Biol. Chem., 2010, 34, 320-327.
[104]
Yu, L.; Guo, Y.; Li, Y.; Li, G.; Li, M.; Luo, J.; Xiong, W.; Qin, W.; Secret, P. Identifying bacterial secreted proteins by fusing new features into Chou’s pseudo amino acid composition. J. Theor. Biol., 2010, 267, 1-6.
[105]
Mohabatkar, H.; Mohammad Beigi, M.; Esmaeili, A. Prediction of GABA(A) receptor proteins using the concept of Chou’s pseudo amino acid composition and support vector machine. J. Theor. Biol., 2011, 281, 18-23.
[106]
Mohammad, B.M.; Behjati, M.; Mohabatkar, H. Prediction of metalloproteinase family based on the concept of Chou’s pseudo amino acid composition using a machine learning approach. J. Struct. Funct. Genomics, 2011, 12, 191-197.
[107]
Zou, D.; He, Z.; He, J.; Xia, Y. Supersecondary structure prediction using Chou’s pseudo amino acid composition. J. Comput. Chem., 2011, 32, 271-278.
[108]
Qiu, J.D.; Suo, S.B.; Sun, X.Y.; Shi, S.P.; Liang, R.P. OligoPred: A web-server for predicting homo-oligomeric proteins by incorporating discrete wavelet transform into Chou’s pseudo amino acid composition. J. Mol. Graph. Model., 2011, 30, 129-134.
[109]
Hayat, M.; Khan, A. Discriminating outer membrane proteins with fuzzy K-nearest neighbor algorithms based on the general form of Chou’s PseAAC. Protein Pept. Lett., 2012, 19, 411-421.
[110]
Nanni, L.; Lumini, A.; Gupta, D.; Garg, A. Identifying bacterial virulent proteins by fusing a set of classifiers based on variants of Chou’s pseudo amino acid composition and on evolutionary information. IEEE-ACM Trans. Comput. Biol. Bioinform., 2012, 9, 467-475.
[111]
Nanni, L.; Brahnam, S.; Lumini, A. Wavelet images and Chou’s pseudo amino acid composition for protein classification. Amino Acids, 2012, 43, 657-665.
[112]
Zia-ur-Rehman Khan, A. Identifying GPCRs and their types with Chou’s pseudo amino acid composition: An approach from multi-scale energy representation and position specific scoring matrix. Protein Pept. Lett., 2012, 19, 890-903.
[113]
Mei, S. Multi-kernel transfer learning based on Chou’s PseAAC formulation for protein submitochondria localization. J. Theor. Biol., 2012, 293, 121-130.
[114]
Sun, X.Y.; Shi, S.P.; Qiu, J.D.; Suo, S.B.; Huang, S.Y.; Liang, R.P. Identifying protein quaternary structural attributes by incorporating physicochemical properties into the general form of Chou’s PseAAC via discrete wavelet transform. Mol. Biosyst., 2012, 8, 3178-3184.
[115]
Gupta, M.K.; Niyogi, R.; Misra, M. An alignment-free method to find similarity among protein sequences via the general form of Chou’s pseudo amino acid composition. SAR QSAR Environ. Res., 2013, 24, 597-609.
[116]
Khosravian, M.; Faramarzi, F.K.; Beigi, M.M.; Behbahani, M.; Mohabatkar, H. Predicting antibacterial peptides by the concept of Chou’s pseudo amino acid composition and machine learning methods. Protein Pept. Lett., 2013, 20, 180-186.
[117]
Georgiou, D.N.; Karakasidis, T.E.; Megaritis, A.C. A short survey on genetic sequences, Chou’s pseudo amino acid composition and its combination with fuzzy set theory. Open Bioinform. J., 2013, 7, 41-48.
[118]
Mohabatkar, H.; Beigi, M.M.; Abdolahi, K.; Mohsenzadeh, S. Prediction of allergenic proteins by means of the concept of Chou’s pseudo amino acid composition and a machine learning approach. Med. Chem., 2013, 9, 133-137.
[119]
Sarangi, A.N.; Lohani, M.; Aggarwal, R. Prediction of essential proteins in prokaryotes by incorporating various physico-chemical features into the general form of Chou’s pseudo amino acid composition. Protein Pept. Lett., 2013, 20, 781-795.
[120]
Huang, C.; Yuan, J.Q. A multilabel model based on Chou’s pseudo amino acid composition for identifying membrane proteins with both single and multiple functional types. J. Membr. Biol., 2013, 246, 327-334.
[121]
Hayat, M.; Iqbal, N. Discriminating protein structure classes by incorporating pseudo average chemical shift to Chou’s general PseAAC and support vector machine. Comput. Methods Programs Biomed., 2014, 116, 184-192.
[122]
Mondal, S.; Pai, P.P. Chou’s pseudo amino acid composition improves sequence-based antifreeze protein prediction. J. Theor. Biol., 2014, 356, 30-35.
[123]
Ding, H.; Deng, E.Z.; Yuan, L.F.; Liu, L.; Lin, H.; Chen, W. iCTX-Type: A sequence-based predictor for identifying the types of conotoxins in targeting ion channels. BioMed Res. Int., 2014, 2014286419
[124]
Nanni, L.; Brahnam, S.; Lumini, A. Prediction of protein structure classes by incorporating different protein descriptors into general Chou’s pseudo amino acid composition. J. Theor. Biol., 2014, 360, 109-116.
[125]
Hajisharifi, Z.; Piryaiee, M.; Mohammad Beigi, M.; Behbahani, M.; Mohabatkar, H. Predicting anticancer peptides with Chou’s pseudo amino acid composition and investigating their mutagenicity via Ames test. J. Theor. Biol., 2014, 341, 34-40.
[126]
Xu, Y.; Wen, X.; Wen, L.S.; Wu, L.Y.; Deng, N.Y. iNitro-Tyr: Prediction of nitrotyrosine sites in proteins with general pseudo amino acid composition. PLoS One, 2014, 9e105018
[127]
Zuo, Y.C.; Peng, Y.; Liu, L.; Chen, W.; Yang, L.; Fan, G.L. Predicting peroxidase subcellular location by hybridizing different descriptors of Chou’s pseudo amino acid patterns. Anal. Biochem., 2014, 458, 14-19.
[128]
Ahmad, S.; Kabir, M.; Hayat, M. Identification of heat shock protein families and J-protein types by incorporating dipeptide composition into Chou’s general PseAAC. Comput. Methods Programs Biomed., 2015, 122, 165-174.
[129]
Kumar, R.; Srivastava, A.; Kumari, B.; Kumar, M. Prediction of beta-lactamase and its class by Chou’s pseudo amino acid composition and support vector machine. J. Theor. Biol., 2015, 365, 96-103.
[130]
Fan, G.L.; Zhang, X.Y.; Liu, Y.L.; Nang, Y.; Wang, H. DSPMP: Discriminating secretory proteins of malaria parasite by hybridizing different descriptors of Chou’s pseudo amino acid patterns. J. Comput. Chem., 2015, 36, 2317-2327.
[131]
Khan, Z.U.; Hayat, M.; Khan, M.A. Discrimination of acidic and alkaline enzyme using Chou’s pseudo amino acid composition in conjunction with probabilistic neural network model. J. Theor. Biol., 2015, 365, 197-203.
[132]
Liu, B.; Chen, J.; Wang, X. Protein remote homology detection by combining Chou’s distance-pair pseudo amino acid composition and principal component analysis. Mol. Genet. Genomics, 2015, 290, 1919-1931.
[133]
Mandal, M.; Mukhopadhyay, A.; Maulik, U. Prediction of protein subcellular localization by incorporating multiobjective PSO-based feature subset selection into the general form of Chou’s PseAAC. Med. Biol. Eng. Comput., 2015, 53, 331-344.
[134]
Sanchez, V.; Peinado, A.M.; Perez-Cordoba, J.L.; Gomez, A.M. A new signal characterization and signal-based Chou’s PseAAC representation of protein sequences. J. Bioinform. Comput. Biol., 2015, 131550024
[135]
Behbahani, M.; Mohabatkar, H.; Nosrati, M. Analysis and comparison of lignin peroxidases between fungi and bacteria using three different modes of Chou’s general pseudo amino acid composition. J. Theor. Biol., 2016, 411, 1-5.
[136]
Ahmad, K.; Waris, M.; Hayat, M. Prediction of protein submitochondrial locations by incorporating dipeptide composition into Chou’s general pseudo amino acid composition. J. Membr. Biol., 2016, 249, 293-304.
[137]
Kabir, M.; Hayat, M. iRSpot-GAEnsC: Identifing recombination spots via ensemble classifier and extending the concept of Chou’s PseAAC to formulate DNA samples. Mol. Genet. Genomics, 2016, 291, 285-296.
[138]
Tiwari, A.K. Prediction of G-protein coupled receptors and their subfamilies by incorporating various sequence features into Chou’s general PseAAC. Comput. Methods Programs Biomed., 2016, 134, 197-213.
[139]
Meher, P.K.; Sahu, T.K.; Saini, V.; Rao, A.R. Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC. Sci. Rep., 2017, 7, 42362.
[140]
Rahimi, M.; Bakhtiarizadeh, M.R.; Mohammadi-Sangcheshmeh, A. OOgenesis_Pred: A sequence-based method for predicting oogenesis proteins by six different modes of Chou’s pseudo amino acid composition. J. Theor. Biol., 2017, 414, 128-136.
[141]
Khan, M.; Hayat, M.; Khan, S.A.; Iqbal, N. Unb-DPC: Identify mycobacterial membrane protein types by incorporating un-biased dipeptide composition into Chou’s general PseAAC. J. Theor. Biol., 2017, 415, 13-19.
[142]
Tripathi, P.; Pandey, P.N. A novel alignment-free method to classify protein folding types by combining spectral graph clustering with Chou’s pseudo amino acid composition. J. Theor. Biol., 2017, 424, 49-54.
[143]
Tahir, M.; Hayat, M.; Kabir, M. Sequence based predictor for discrimination of enhancer and their types by applying general form of Chou’s trinucleotide composition. Comput. Methods Programs Biomed., 2017, 146, 69-75.
[144]
Liang, Y.; Zhang, S. Predict protein structural class by incorporating two different modes of evolutionary information into Chou’s general pseudo amino acid composition. J. Mol. Graph. Model., 2017, 78, 110-117.
[145]
Adilina, S.; Farid, D.M.; Shatabda, S. Effective DNA binding protein prediction by using key features via Chou’s general PseAAC. J. Theor. Biol., 2018, 460, 64-78.
[146]
Akbar, S.; Hayat, M. iMethyl-STTNC: Identification of N(6)-methyladenosine sites by extending the Idea of SAAC into Chou’s PseAAC to formulate RNA sequences. J. Theor. Biol., 2018, 455, 205-211.
[147]
Arif, M.; Hayat, M.; Jan, Z. iMem-2LSAAC: A two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into Chou’s pseudo amino acid composition. J. Theor. Biol., 2018, 442, 11-21.
[148]
Butt, A.H.; Rasool, N.; Khan, Y.D. Predicting membrane proteins and their types by extracting various sequence features into Chou’s general PseAAC. Mol. Biol. Rep., 2018.
[http://dx.doi.org/10.1007/s11033-018-4391-5]
[149]
Chen, G.; Cao, M.; Yu, J.; Guo, X.; Shi, S. Prediction and functional analysis of prokaryote lysine acetylation site by incorporating six types of features into Chou’s general PseAAC. J. Theor. Biol., 2018, 461, 92-101.
[150]
Fu, X.; Zhu, W.; Liao, B.; Cai, L.; Peng, L.; Yang, J. Improved DNA-binding protein identification by incorporating evolutionary information into the Chou’s PseAAC. IEEE Access, 2018, 20
[http://dx.doi.org/10.1109/ACCESS.2018.2876656]
[151]
Contreras-Torres, E. Predicting structural classes of proteins by incorporating their global and local physicochemical and conformational properties into general Chou’s PseAAC. J. Theor. Biol., 2018, 454, 139-145.
[152]
Javed, F.; Hayat, M. Predicting subcellular localizations of multi-label proteins by incorporating the sequence features into Chou’s PseAAC. Genomics, 2018.
[http://dx.doi.org/10.1016/j.ygeno.2018.09.004]
[153]
Ju, Z.; Wang, S.Y. Prediction of citrullination sites by incorporating K-spaced amino acid pairs into Chou’s general pseudo amino acid composition. Gene, 2018, 664, 78-83.
[154]
Liang, Y.; Zhang, S. Identify Gram-negative bacterial secreted protein types by incorporating different modes of PSSM into Chou’s general PseAAC via Kullback-Leibler divergence. J. Theor. Biol., 2018, 454, 22-29.
[155]
Mei, J.; Fu, Y.; Zhao, J. Analysis and prediction of ion channel inhibitors by using feature selection and Chou’s general pseudo amino acid composition. J. Theor. Biol., 2018, 456, 41-48.
[156]
Mei, J.; Zhao, J. Prediction of HIV-1 and HIV-2 proteins by using Chou’s pseudo amino acid compositions and different classifiers. Sci. Rep., 2018, 8, 2359.
[157]
Mei, J.; Zhao, J. Analysis and prediction of presynaptic and postsynaptic neurotoxins by Chou’s general pseudo amino acid composition and motif features. J. Theor. Biol., 2018, 447, 147-153.
[158]
Mousavizadegan, M.; Mohabatkar, H. Computational prediction of antifungal peptides via Chou’s PseAAC and SVM. J. Bioinform. Comput. Biol., 2018, 1850016.
[159]
Qiu, W.; Li, S.; Cui, X.; Yu, Z.; Wang, M.; Du, J.; Peng, Y.; Yu, B. Predicting protein submitochondrial locations by incorporating the pseudo-position specific scoring matrix into the general Chou’s pseudo-amino acid composition. J. Theor. Biol., 2018, 450, 86-103.
[160]
Rahman, S.M.; Shatabda, S.; Saha, S.; Kaykobad, M.; Sohel Rahman, M. DPP-PseAAC: A DNA-binding protein prediction model using Chou’s general PseAAC. J. Theor. Biol., 2018, 452, 22-34.
[161]
Sankari, E.S.; Manimegalai, D.D. Predicting membrane protein types by incorporating a novel feature set into Chou’s general PseAAC. J. Theor. Biol., 2018, 455, 319-328.
[162]
Srivastava, A.; Kumar, R.; Kumar, M. BlaPred: Predicting and classifying beta-lactamase using a 3-tier prediction system via Chou’s general PseAAC. J. Theor. Biol., 2018, 457, 29-36.
[163]
Wang, L.; Zhang, R.; Mu, Y. Fu-SulfPred: Identification of Protein S-sulfenylation sites by fusing forests via Chou’s general PseAAC. J. Theor. Biol., 2018, 461, 51-58.
[164]
Zhang, S.; Duan, X. Prediction of protein subcellular localization with oversampling approach and Chou’s general PseAAC. J. Theor. Biol., 2018, 437, 239-250.
[165]
Zhang, S.; Liang, Y. Predicting apoptosis protein subcellular localization by integrating auto-cross correlation and PSSM into Chou’s PseAAC. J. Theor. Biol., 2018, 457, 163-169.
[166]
Zhao, W.; Wang, L.; Zhang, T.X.; Zhao, Z.N.; Du, P.F. A brief review on software tools in generating Chou’s pseudo-factor representations for all types of biological sequences. Protein Pept. Lett., 2018.
[http://dx.doi.org/10.2174/0929866525666180905111124]
[167]
Chou, K.C. Pseudo amino acid composition and its applications in bioinformatics, proteomics and system biology. Curr. Proteomics, 2009, 6, 262-274.
[168]
Chou, K.C. An unprecedented revolution in medicinal chemistry driven by the progress of biological science. Curr. Top. Med. Chem., 2017, 17, 2337-2358.
[169]
Shen, H.B. PseAAC: A flexible web-server for generating various kinds of protein pseudo amino acid composition. Anal. Biochem., 2008, 373, 386-388.
[170]
Du, P.; Wang, X.; Xu, C.; Gao, Y. PseAAC-Builder: A cross-platform stand-alone program for generating various special Chou’s pseudo amino acid compositions. Anal. Biochem., 2012, 425, 117-119.
[171]
Cao, D.S.; Xu, Q.S.; Liang, Y.Z. Propy: A tool to generate various modes of Chou’s PseAAC. Bioinformatics, 2013, 29, 960-962.
[172]
Du, P.; Gu, S.; Jiao, Y. PseAAC-General: Fast building various modes of general form of Chou’s pseudo amino acid composition for large-scale protein datasets. Int. J. Mol. Sci., 2014, 15, 3495-3506.
[173]
Chen, W.; Lei, T.Y.; Jin, D.C.; Lin, H. PseKNC: A flexible web-server for generating pseudo K-tuple nucleotide composition. Anal. Biochem., 2014, 456, 53-60.
[174]
Chen, W.; Feng, P.M.; Lin, H. iSS-PseDNC: Identifying splicing sites using pseudo dinucleotide composition. BioMed Res. Int., 2014, 2014, 623149.
[175]
Chen, W.; Lin, H. Pseudo nucleotide composition or PseKNC: An effective formulation for analyzing genomic sequences. Mol. Biosyst., 2015, 11, 2620-2634.
[176]
Liu, B.; Fang, L.; Long, R.; Lan, X. iEnhancer-2L: A two-layer predictor for identifying enhancers and their strength by pseudo K-tuple nucleotide composition. Bioinformatics, 2016, 32, 362-369.
[177]
Liu, B.; Long, R. iDHS-EL: Identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework. Bioinformatics, 2016, 32, 2411-2418.
[178]
Al-Maruf, M.A.; Shatabda, S. iRSpot-SF: Prediction of recombination hotspots by incorporating sequence based features into Chou’s pseudo components. Genomics, 2018, pii: S0888-7543(18)30214-3.
[http://dx.doi.org/10.1016/j.ygeno.2018.06.003]
[179]
Sabooh, M.F.; Iqbal, N.; Khan, M.; Khan, M.; Maqbool, H.F. Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou’s PseKNC. J. Theor. Biol., 2018, 452, 1-9.
[180]
Zhang, L.; Kong, L. iRSpot-ADPM: Identify recombination spots by incorporating the associated dinucleotide product model into Chou’s pseudo components. J. Theor. Biol., 2018, 441, 1-8.
[181]
Zhang, L.; Kong, L. iRSpot-PDI: Identification of recombination spots by incorporating dinucleotide property diversity information into Chou’s pseudo components. Genomics, 2018.
[http://dx.doi.org/10.1016/j.ygeno.2018.03.003]
[182]
Liu, B.; Liu, F.; Wang, X.; Chen, J.; Fang, L. Pse-in-One: A web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Res., 2015, 43, W65-W71.
[183]
Liu, B.; Wu, H. Pse-in-One 2.0: An improved package of web servers for generating various modes of pseudo components of DNA, RNA, and protein Sequences. Nat. Sci., 2017, 9, 67-91.
[184]
Zhang, C.T. Monte Carlo simulation studies on the prediction of protein folding types from amino acid composition. Biophys. J., 1992, 63, 1523-1529.
[185]
Chou, K.C. A vectorized sequence-coupling model for predicting HIV protease cleavage sites in proteins. J. Biol. Chem., 1993, 268, 16938-16948.
[186]
Zhang, C.T. An analysis of protein folding type prediction by seed-propagated sampling and jackknife test. J. Protein Chem., 1995, 14, 583-593.
[187]
Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res., 2002, 16, 321-357.
[188]
Lin, W.Z.; Fang, J.A.; Xiao, X. iLoc-Animal: A multi-label learning classifier for predicting subcellular localization of animal proteins. Mol. Biosyst., 2013, 9, 634-644.
[189]
Qiu, W.R.; Sun, B.Q.; Xiao, X.; Xu, Z.C. iPTM-mLys: Identifying multiple lysine PTM sites and their different types. Bioinformatics, 2016, 32, 3116-3123.
[190]
Cheng, X.; Zhao, S.G.; Xiao, X. iATC-mHyb: A hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals. Oncotarget, 2017, 8, 58494-58503.
[191]
Xiao, X.; Cheng, X.; Chen, G.; Mao, Q. pLoc_bal-mGpos: Predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC. Genomics, 2018.
[http://dx.doi.org/10.1016/j.ygeno.2018.05.017]
[192]
Chou, K.C.; Zhang, C.T. Review: Prediction of protein structural classes. Crit. Rev. Biochem. Mol. Biol., 1995, 30, 275-349.
[193]
Zhou, G.P.; Assa-Munt, N. Some insights into protein structural class prediction. Proteins Struct. Funct. Genet., 2001, 44, 57-59.
[194]
Elrod, D.W. Prediction of enzyme family classes. J. Proteome Res., 2003, 2, 183-190.
[195]
Chou, K.C.; Shen, H.B. MemType-2L: A Web server for predicting membrane proteins and their types by incorporating evolution information through Pse-PSSM. Biochem. Biophys. Res. Commun., 2007, 360, 339-345.
[196]
Ali, F.; Hayat, M. Classification of membrane protein types using voting feature interval in combination with Chou’s pseudo amino acid composition. J. Theor. Biol., 2015, 384, 78-83.
[197]
Tahir, M.; Hayat, M. iNuc-STNC: A sequence-based predictor for identification of nucleosome positioning in genomes by extending the concept of SAAC and Chou’s PseAAC. Mol. Biosyst., 2016, 12, 2587-2593.
[198]
Ehsan, A.; Mahmood, K.; Khan, Y.D.; Khan, S.A. A novel modeling in mathematical biology for classification of signal peptides. Sci. Rep., 2018, 8, 1039.
[199]
Wu, Z.C.; Xiao, X. iLoc-Gpos: A multi-layer classifier for predicting the subcellular localization of singleplex and multiplex gram-positive bacterial proteins. Protein Pept. Lett., 2012, 19, 4-14.
[200]
Huang, C.; Yuan, J. Using radial basis function on the general form of Chou’s pseudo amino acid composition and PSSM to predict subcellular locations of proteins with both single and multiple sites. Biosystems, 2013, 113, 50-57.
[201]
Huang, C.; Yuan, J.Q. Predicting protein subchloroplast locations with both single and multiple sites via three different modes of Chou’s pseudo amino acid compositions. J. Theor. Biol., 2013, 335, 205-212.
[202]
Xu, Y.; Shao, X.J.; Wu, L.Y.; Deng, N.Y. iSNO-AAPair: Incorporating amino acid pairwise coupling into PseAAC for predicting cysteine S-nitrosylation sites in proteins. PeerJ, 2013, 1, e171.
[203]
Chou, K.C. Using subsite coupling to predict signal peptides. Protein Eng., 2001, 14, 75-79.
[204]
Chou, K.C. Prediction of signal peptides using scaled window. Peptides, 2001, 22, 1973-1979.
[205]
Lin, H.; Deng, E.Z.; Ding, H.; Chen, W. iPro54-PseKNC: A sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition. Nucleic Acids Res., 2014, 42, 12961-12972.
[206]
Qiu, W.R.; Xiao, X. iRSpot-TNCPseAAC: Identify recombination spots with trinucleotide composition and pseudo amino acid components. Int. J. Mol. Sci., 2014, 15, 1746-1766.
[207]
Xu, R.; Zhou, J.; Liu, B.; He, Y.A.; Zou, Q.; Wang, X. Identification of DNA-binding proteins by incorporating evolutionary information into pseudo amino acid composition via the top-n-gram approach. J. Biomol. Struct. Dyn., 2015, 33, 1720-1730.
[208]
Liu, B.; Fang, L.; Wang, S.; Wang, X.; Li, H. Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy. J. Theor. Biol., 2015, 385, 153-159.
[209]
Jia, J.; Liu, Z.; Xiao, X. iPPI-Esml: An ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into PseAAC. J. Theor. Biol., 2015, 377, 47-56.
[210]
Chen, W.; Feng, P.; Ding, H.; Lin, H. iRNA-Methyl: Identifying N6-methyladenosine sites using pseudo nucleotide composition. Anal. Biochem., 2015, 490, 26-33.
[211]
Chen, W.; Ding, H.; Feng, P.; Lin, H. iACP: A sequence-based tool for identifying anticancer peptides. Oncotarget, 2016, 7, 16895-16909.
[212]
Chen, W.; Feng, P.; Ding, H.; Lin, H. Using deformation energy to analyze nucleosome positioning in genomes. Genomics, 2016, 107, 69-75.
[213]
Jia, J.; Liu, Z.; Xiao, X.; Liu, B. Identification of protein-protein binding sites by incorporating the physicochemical properties and stationary wavelet transforms into pseudo amino acid composition (iPPBS-PseAAC). J. Biomol. Struct. Dyn., 2016, 34, 1946-1961.
[214]
Jia, J.; Zhang, L.; Liu, Z.; Xiao, X. pSumo-CD: Predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general PseAAC. Bioinformatics, 2016, 32, 3133-3141.
[215]
Qiu, W.R.; Sun, B.Q.; Xiao, X.; Xu, D. iPhos-PseEvo: Identifying human phosphorylated proteins by incorporating evolutionary information into general PseAAC via grey system theory. Mol. Inform., 2017, 36, UNSP 1600010.
[216]
Shen, H.B. Recent advances in developing web-servers for predicting protein attributes. Nat. Sci., 2009, 1, 63-92.
[217]
Shen, H.B. HIVcleave: A web-server for predicting HIV protease cleavage sites in proteins. Anal. Biochem., 2008, 375, 388-390.
[218]
Liu, B.; Fang, L.; Liu, F.; Wang, X.; Chen, J. Identification of real microRNA precursors with a pseudo structure status composition approach. PLoS One, 2015, 10, e0121501.
[219]
Qiu, W.R.; Sun, B.Q.; Xiao, X.; Xu, Z.C. iHyd-PseCp: Identify hydroxyproline and hydroxylysine in proteins by incorporating sequence-coupled effects into general PseAAC. Oncotarget, 2016, 7, 44310-44321.
[220]
Chen, J.; Long, R.; Wang, X.L.; Liu, B. dRHP-PseRA: Detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation. Sci. Rep., 2016, 6, 32333.
[221]
Jia, J.; Liu, Z.; Xiao, X.; Liu, B. iCar-PseCp: Identify carbonylation sites in proteins by Monte Carlo sampling and incorporating sequence coupled effects into general PseAAC. Oncotarget, 2016, 7, 34558-34570.
[222]
Liu, B.; Fang, L.; Liu, F.; Wang, X. iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. J. Biomol. Struct. Dyn., 2016, 34, 223-235.
[223]
Qiu, W.R.; Xiao, X.; Xu, Z.C. iPhos-PseEn: Identifying phosphorylation sites in proteins by fusing different pseudo components into an ensemble classifier. Oncotarget, 2016, 7, 51270-51283.
[224]
Xiao, X.; Ye, H.X.; Liu, Z.; Jia, J.H. iROS-gPseKNC: Predicting replication origin sites in DNA by incorporating dinucleotide position-specific propensity into general pseudo nucleotide composition. Oncotarget, 2016, 7, 34180-34189.
[225]
Zhang, C.J.; Tang, H.; Li, W.C.; Lin, H.; Chen, W. iOri-Human: Identify human origin of replication by incorporating dinucleotide physicochemical properties into pseudo nucleotide composition. Oncotarget, 2016, 7, 69783-69793.
[226]
Chen, W.; Feng, P.; Yang, H.; Ding, H.; Lin, H. iRNA-AI: Identifying the adenosine to inosine editing sites in RNA sequences. Oncotarget, 2017, 8, 4208-4217.
[227]
Liu, L.M.; Xu, Y. iPGK-PseAAC: Identify lysine phosphoglycerylation sites in proteins by incorporating four different tiers of amino acid pairwise coupling information into the general PseAAC. Med. Chem., 2017, 13, 552-559.
[228]
Qiu, W.R.; Jiang, S.Y.; Sun, B.Q.; Xiao, X.; Cheng, X. iRNA-2methyl: Identify RNA 2′-O-methylation sites by incorporating sequence-coupled effects into general PseKNC and ensemble classifier. Med. Chem., 2017, 13, 734-743.
[229]
Liu, B.; Wu, H.; Zhang, D.; Wang, X. Pse-Analysis: A python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods. Oncotarget, 2017, 8, 13338-13343.
[230]
Xu, Y.; Li, C. iPreny-PseAAC: Identify C-terminal cysteine prenylation sites in proteins by incorporating two tiers of sequence couplings into PseAAC. Med. Chem., 2017, 13, 544-551.
[231]
Wang, J.; Yang, B.; Leier, A.; Marquez-Lago, T.T.; Hayashida, M.; Rocker, A.; Yanju, Z.; Akutsu, T.; Strugnell, R.A.; Song, J.; Lithgow, T. Bastion6: A bioinformatics approach for accurate prediction of type VI secreted effectors. Bioinformatics, 2018, 34, 2546-2555.
[232]
Liu, B.; Li, K.; Huang, D.S. iEnhancer-EL: Identifying enhancers and their strength with ensemble learning approach. Bioinformatics, 2018, 34, 3835-3842.
[233]
Chen, Z.; Zhao, P.Y.; Li, F.; Leier, A.; Marquez-Lago, T.T.; Wang, Y.; Webb, G.I.; Smith, A.I.; Daly, R.J.; Song, J. iFeature: A python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics, 2018, 34, 2499-2502.
[234]
Shen, H.B. Cell-PLoc 2.0: An improved package of web-servers for predicting subcellular localization of proteins in various organisms. Nat. Sci., 2010, 2, 1090-1103.
[235]
Chou, K.C. Advance in predicting subcellular localization of multi-label proteins and its implication for developing multi-target drugs. Curr. Med. Chem., 2019.
[http://dx.doi.org/10.2174/0929867326666190507082559]


Article Details

VOLUME: 15
ISSUE: 5
Year: 2019
Page: [496 - 509]
Pages: 14
DOI: 10.2174/1573406415666181217114710