Recent Development of Computational Predicting Bioluminescent Proteins

Author(s): Dan Zhang, Zheng-Xing Guan, Zi-Mei Zhang, Shi-Hao Li, Fu-Ying Dao, Hua Tang*, Hao Lin*.

Journal Name: Current Pharmaceutical Design

Volume 25 , Issue 40 , 2019

Abstract:

Bioluminescent proteins (BLPs) are widely distributed among living organisms and play a key role in light emission during bioluminescence. Bioluminescence serves various functions, such as finding food and protecting organisms from predators. With its routine use in biotechnology, bioluminescence has become essential to many medical, commercial, and general technological advances. The prediction and characterization of BLPs are therefore significant: they can help reveal more about bioluminescence and promote the development of its applications. Because experimental methods for identifying BLPs are costly and time-consuming, bioinformatics tools have played an important role in the fast and accurate prediction of BLPs by combining sequence information with machine learning methods. In this review, we summarize and compare the application of machine learning methods to the prediction of BLPs from different aspects. We hope that this review will provide insights and inspiration for research on BLPs.

Keywords: Bioluminescent proteins, machine learning methods, sequence-derived features, feature analysis, bioinformatics tools.

Article Details

VOLUME: 25
ISSUE: 40
Year: 2019
Page: [4264 - 4273]
Pages: 10
DOI: 10.2174/1381612825666191107100758