Recent Development of Computational Predicting Bioluminescent Proteins

Dan Zhang; Zheng-Xing Guan; Zi-Mei Zhang; Shi-Hao Li; Fu-Ying Dao; Hua Tang; Hao Lin
Abstract

Bioluminescent proteins (BLPs) are widely distributed among living organisms and play a key role in light emission during bioluminescence. Bioluminescence serves various functions, such as finding food and protecting organisms from predators. With its routine use in biotechnology, bioluminescence is recognized as essential for many medical, commercial, and general technological advances. The prediction and characterization of BLPs are therefore significant: they can help to reveal more about bioluminescence and promote the development of its applications. Because experimental methods for identifying BLPs are costly and time-consuming, bioinformatics tools have played an important role in the fast and accurate prediction of BLPs by combining sequence information with machine learning methods. In this review, we summarize and compare the application of machine learning methods to the prediction of BLPs from different aspects. We hope that this review will provide insights and inspiration for research on BLPs.
Keywords: Bioluminescent proteins, machine learning methods, sequence-derived features, feature analysis, bioinformatics tools.
DOI: https://dx.doi.org/10.2174/1381612825666191107100758
Print ISSN: 1381-6128
Online ISSN: 1873-4286
Publisher Name: Bentham Science Publisher
Journal: Current Pharmaceutical Design
