Fast DNA Sequence Alignment Algorithm Based on Quality Score Using Improved Dynamic Programming and Fuzzy Gap Cost Control
Kwang Baek Kim,
Hyun Jun Park,
Doo Heon Song.
A sequence alignment algorithm is a basic building block for protein analysis and nucleic acid analysis in
bioinformatics. Such alignment represents the similarities and differences of two or more compared sequences. Thus,
there have been many algorithms and tools studied and developed. In this paper, we focus on the PHRED based sequence
alignment algorithm using Needleman-Wunsch dynamic programming. Although it is well known and proven to be
reliable to some extent, it suffers from the heavy computation of producing scoring matrix based on dynamic
programming whose time complexity is O(mn). We propose a method applying quadrant method in that process to reduce
the computational loads. Also, PHRED based algorithms suffer from the environment when low quality bases are
frequently in tips of DNA fragments. Thus, we designed a fuzzy logic system to control the gap cost dynamically to
improve the quality of the alignment. In the experiment using real genome data from NCBI (National Center for
Biotechnology Information), we verify that the proposed method reduces the computational loads by half in producing
scoring matrix and thus the alignment quality is also improved by our fuzzy inference system.
Keywords: DNA sequence alignment, dynamic programming, fuzzy inference system, gap cost, quadrant, quality score.
Rights & PermissionsPrintExport