Background: Many bioinformatics pipelines are now available to analyze
transcriptomics data produced by high-throughput RNA sequencing. They implement different
workflows that address several analysis tasks with the support of third-party programs.
Nevertheless, a proper workflow definition for RNA-seq data analysis is still lacking.
Objective: To properly define what a comprehensive RNA-seq data analysis workflow should be,
to compare all available pipelines against it and, if none provides such a solution, to implement a new pipeline.
Method: We developed a new pipeline integrating state-of-the-art programs for the different parts of
the RNA-seq analysis. We also used RUbioSeq libraries to achieve a scalable solution.
Results: We defined a comprehensive RNA-seq data analysis workflow comprising the most
common needs of biologists, and implemented it in a new pipeline, nextpresso. We also
validated it in two case studies presented here.
Conclusion: Nextpresso is a new, freely available pipeline covering the most common needs of RNA-seq
data analysis. It is easy to configure, generates user-friendly results and scales well to larger studies
comprising a large number of samples.