Current Protein & Peptide Science

ISSN (Print): 1389-2037
ISSN (Online): 1875-5550

General Review Article

An Overview on Predicting Protein Subchloroplast Localization by using Machine Learning Methods

Author(s): Meng-Lu Liu, Wei Su, Zheng-Xing Guan, Dan Zhang, Wei Chen*, Li Liu* and Hui Ding*

Volume 21, Issue 12, 2020

Page: [1229 - 1241] Pages: 13

DOI: 10.2174/1389203721666200117153412

Abstract

The chloroplast is a subcellular organelle of green plants and eukaryotic algae that plays an important role in photosynthesis. Since the function of a protein correlates with its location, knowing its subchloroplast localization helps elucidate its function. However, owing to the large number of chloroplast proteins, determining their subchloroplast localizations by biological experiments is costly and time-consuming. To address this problem, twelve computational methods have been developed over the past ten years to predict protein subchloroplast localization. This review summarizes the research progress in this area. We hope it will provide useful guidance for further computational studies of protein subchloroplast localization.

Keywords: Protein, subchloroplast localization, machine learning method, protein sequence properties, feature selection, dataset.
