Identification of 2’-O-methylation Site by Investigating Multi-feature Extracting Techniques

Author(s): Qin-Lai Huang, Lida Wang, Shu-Guang Han*, Hua Tang*

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research

Volume 23, Issue 6, 2020

Abstract:

Background: RNA methylation is a reversible post-transcriptional modification involved in numerous biological processes. Ribose 2'-O-methylation is one type of RNA methylation. It has been shown that ribose 2'-O-methylation plays an important role in immune recognition and in the pathogenesis of disease.

Objective: We aim to design a computational method to identify 2'-O-methylation sites.

Methods: In contrast to experimental methods, we propose a computational workflow that identifies methylation sites using multiple feature extraction algorithms.

Results: With a voting procedure based on the 7 best feature-classifier combinations, we achieved an accuracy of 76.5% in 10-fold cross-validation. Furthermore, we optimized the features and fed the optimized features into a support vector machine (SVM). As a result, the AUC reached 0.813.
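
The voting step can be illustrated with a minimal sketch. The abstract does not state the voting rule, so simple majority voting over binary labels is an assumption here, and the function names (`majority_vote`, `accuracy`) and the toy data are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model binary labels (1 = site, 0 = non-site) by simple majority.

    Assumption: each of the feature-classifier combinations casts one equal vote.
    """
    n_models = len(predictions)
    if n_models == 0 or n_models % 2 == 0:
        raise ValueError("use an odd, non-zero number of models to avoid ties")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict the same number of samples")
    # zip(*predictions) yields, for each sample, the tuple of votes from all models.
    return [1 if sum(votes) > n_models // 2 else 0 for votes in zip(*predictions)]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy data: 7 hypothetical models, 4 samples (illustrative values only).
toy_predictions = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
]
toy_labels = [1, 0, 1, 0]

voted_labels = majority_vote(toy_predictions)
toy_accuracy = accuracy(toy_labels, voted_labels)
```

In a 10-fold cross-validation setting, the same vote would be applied to the held-out predictions of each fold and the accuracy computed over all held-out samples.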

Conclusion: The RNA samples used in this study, especially the negative samples, were selected more objectively and strictly, so we obtained more representative results than state-of-the-art studies.

Keywords: 2'-O-methylation, feature extraction, classification algorithm, vote strategy, cross-validation, feature selection.

Article Details

VOLUME: 23
ISSUE: 6
Year: 2020
Published on: 05 October, 2020
Page: [527 - 535]
Pages: 9
DOI: 10.2174/1386207323666200425210609