Computational Approaches for Transcriptome Assembly Based on Sequencing Technologies

Yuwen Luo; Xingyu Liao; Fang-Xiang Wu; Jianxin Wang
Abstract

Transcriptome assembly plays a critical role in studying biological properties and examining the expression levels of genomes in specific cells. It is also the basis of many downstream analyses. As sequencing becomes faster and cheaper, massive amounts of sequencing data continue to accumulate, and a large number of assembly strategies based on different computational methods and experiments have been developed. Performing transcriptome assembly efficiently, with high sensitivity and accuracy, has become a key issue. In this work, the issues with transcriptome assembly are explored with respect to different sequencing technologies. Specifically, transcriptome assemblies from next-generation sequencing reads are divided into reference-based assemblies and de novo assemblies. Examples from different species are used to illustrate that long reads produced by third-generation sequencing technologies can cover full-length transcripts without assembly. In addition, transcriptome assemblies that use Hybrid-seq methods and other tools are summarized. Finally, we discuss future directions for transcriptome assembly.
Keywords: Transcriptome assembly, sequencing technologies, Hybrid-seq, full-length transcript, annotation, genomes.

Journal: Current Bioinformatics
Publisher: Bentham Science Publisher
DOI: https://dx.doi.org/10.2174/1574893614666190410155603
Print ISSN: 1574-8936
Online ISSN: 2212-392X
