A Roadmap to Sequence Assembly Evaluation Tools

Author(s): Sara El-Metwally*, Eslam Hamouda, Mayada Tarek

Journal Name: Current Bioinformatics

Volume 16, Issue 5, 2021




Abstract:

Assembly evaluation is the first step towards meaningful downstream data analysis. We need to know how much accurate information an assembled sequence contains before proceeding to any further analysis stage. Assembly evaluation tools target four basic metrics: contiguity, accuracy, completeness, and contamination. Some tools evaluate these metrics by comparing the assembly results to a closely related reference. Others use various heuristics to compensate for the lack of a guiding reference, such as the consistency between the assembly results and the sequencing reads. In this paper, we discuss assembly evaluation as a core stage in any sequence assembly pipeline and present a roadmap that most assembly evaluation tools follow to assess the different metrics. We highlight the challenges that currently exist in assembly evaluation tools and summarize their technical and practical details to help end-users choose the best tool for their working scenarios. To illustrate the similarities and differences among assembly assessment tools, including their evaluation approaches, metrics, comprehensiveness, limitations, usability, and how the evaluation results are presented to the end-user, we provide a practical example that evaluates Velvet assembly results for the S. aureus dataset from the GAGE competition. A GitHub repository (https://github.com/SaraEl-Metwally/Assembly-Evaluation-Tools) provides the detailed evaluation results along with the command-line parameters used to generate them.
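
As an illustration of the contiguity metrics mentioned above, the following minimal sketch computes N50, a widely used contiguity statistic: the contig length at which contigs of that length or longer account for at least half of the total assembly length. The abstract does not name N50 or any specific statistic, so the choice of metric, the function name `n50`, and the example contig lengths are assumptions made here for illustration only; they are not taken from the paper or from any of the tools it reviews.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if it is empty).

    Illustrative helper, not part of any assembly evaluation tool.
    """
    # Walk through the contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2

    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembly length defines N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, in base pairs.
example_contigs = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_contigs)
```

Here the total length is 290, so the running sum first reaches half of it (145) at the second-longest contig, and `example_n50` is 70. A higher N50 suggests a more contiguous assembly, but it says nothing about accuracy, completeness, or contamination, which is why the four metrics are assessed separately.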

Keywords: Sequence assembly, evaluation tools, contiguity metrics, accuracy metrics, completeness, contamination metrics.


Article Details

Published on: 11 November, 2020
Page: [644 - 661]
Pages: 18
DOI: 10.2174/1574893615999201111140419
