Review Article

A Survey for Predicting Enzyme Family Classes Using Machine Learning Methods

Author(s): Jiu-Xin Tan, Hao Lv, Fang Wang, Fu-Ying Dao, Wei Chen* and Hui Ding*

Volume 20, Issue 5, 2019

Page: [540 - 550] Pages: 11

DOI: 10.2174/1389450119666181002143355


Abstract

Enzymes are proteins that act as biological catalysts to speed up cellular biochemical processes. According to the first digit of their Enzyme Commission (EC) numbers, enzymes are divided into six main classes: EC-1: oxidoreductase; EC-2: transferase; EC-3: hydrolase; EC-4: lyase; EC-5: isomerase; and EC-6: ligase (synthetase). Different enzymes have different biological functions and act on different substrates. Therefore, knowing which family an enzyme belongs to can help infer its catalytic mechanism and provide information about the relevant biological function. With the large number of protein sequences flowing into databanks in the post-genomic age, annotating the family of each enzyme is very important. Since experimental methods are not cost-effective, bioinformatics tools can be of great help in accurately classifying enzymes into families. In this review, we summarize the application of machine learning methods to the prediction of enzyme family from different aspects. We hope that this review will provide insights and inspiration for research on enzyme family classification.

Keywords: Enzyme, family, classification, machine learning methods.
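
As a small illustration of the six-class scheme described in the abstract, the sketch below maps the first digit of an EC number to its main class. The function name `main_class` and the dictionary `EC_MAIN_CLASSES` are hypothetical helpers introduced here for illustration; they are not taken from the review or from any tool it surveys.

```python
# Main enzyme classes as listed in the abstract.
# Note: EC-6 is called "synthetase" in the abstract; it is also known as "ligase".
EC_MAIN_CLASSES = {
    1: "oxidoreductase",
    2: "transferase",
    3: "hydrolase",
    4: "lyase",
    5: "isomerase",
    6: "synthetase",
}


def main_class(ec_number: str) -> str:
    """Return the main class name for an EC number such as '3.4.21.4'.

    Accepts an optional 'EC' prefix (e.g. 'EC 3.4.21.4' or 'EC-3.4.21.4').
    Raises ValueError if the first field is not one of the six classes above.
    """
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        # Drop the prefix and any separator that follows it.
        text = text[2:].lstrip(" -:")
    first_field = text.split(".")[0]
    if not first_field.isdigit():
        raise ValueError(f"not an EC number: {ec_number!r}")
    digit = int(first_field)
    if digit not in EC_MAIN_CLASSES:
        raise ValueError(f"unknown main class {digit} in {ec_number!r}")
    return EC_MAIN_CLASSES[digit]
```

For example, `main_class("EC 3.4.21.4")` looks up class 3 and returns the hydrolase label. A predictor of the kind surveyed in the review would output one of these six labels for a protein sequence; this lookup only covers the labeling side, not the prediction itself.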

