Analysis and Comparison of RNA Pseudouridine Site Prediction Tools

Author(s): Wei Chen*, Kewei Liu

Journal Name: Current Bioinformatics

Volume 15, Issue 4, 2020

Abstract:

Background: Pseudouridine (Ψ) is the most abundant RNA modification and plays important roles in a range of biological and cellular processes. Although experimental techniques have contributed greatly to identifying Ψ sites, they remain labor-intensive and are not cost-effective. In the past few years, a series of computational approaches have been developed that provide rapid and efficient ways to identify Ψ sites.

Results: To give readers a clear picture of recent developments in this area, this review summarizes and compares representative computational approaches developed for identifying Ψ sites. Future directions for the computational identification of Ψ sites are also discussed.

Conclusion: We anticipate that this review will provide novel insights into research on pseudouridine modification.

Keywords: Epitranscriptome, RNA modification, pseudouridine, support vector machine, nucleotide physicochemical property, web server.




Article Details

VOLUME: 15
ISSUE: 4
Year: 2020
Page: [279 - 286]
Pages: 8
DOI: 10.2174/1574893614666191018171521