Background: The advent of the single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) has paved the way to obtain longer full-length transcripts. This method was found to be much superior in identifying full-length splice variants and other post-transcriptional events as compared to the next generation sequencing (NGS)-based short read sequencing (RNA-Seq). Several different bioinformatics tools to analyze the Iso-Seq data have been developed and some of them are still being refined to address different aspects of transcriptome complexity. However, a comprehensive summary of the available tools and their utility is still lacking.
Objective: Here, we summarized the existing Iso-Seq analysis tools and presented an integrated bioinformatics pipeline for Iso-Seq analysis, which overcomes the limitations of NGS and generates long contiguous full-length non-chimeric (FLNC) reads for analysis of post-transcriptional events.
Results: In this review, we summarized recent applications of Iso-Seq in plants, which include improved genome annotations, identification of novel genes and lncRNAs, identification of full-length splice isoforms, detection of novel alternative splicing (AS) and alternative polyadenylation (APA) events. In addition, we also discussed the bioinformatics pipeline for comprehensive Iso-Seq data analysis, including how to reduce the error rate in the reads and how to identify and quantify post-transcriptional events. Furthermore, visualization approach of Iso-Seq was discussed as well. Finally, we discussed methods to combine Iso-Seq data with RNA-Seq for transcriptome quantification.
Conclusion: Overall, this review demonstrates that the Iso-Seq is pivotal for analyzing transcriptome complexity and this new method offers unprecedented opportunities to comprehensively understand transcripts diversity.