Searching Exact Tandem Repeats in DNA Sequences Using Enhanced Suffix Array

Author(s): Shivika Gupta*, Rajesh Prasad

Journal Name: Current Bioinformatics

Volume 13, Issue 2, 2018

Abstract:

Background: Genomes of organisms contain a variety of repeated structures of various lengths and types, either interspersed or tandem. Tandem repeats play an important role in molecular biology: they are related to the genetic backgrounds of inherited diseases, and they can serve as markers for DNA mapping and DNA fingerprinting. Improving the efficiency of algorithms that search for tandem repeats in DNA sequences can lead to many useful applications in genomics.

Objective: We introduce an efficient O(n) algorithm for finding maximum-length exact tandem repeats in genomes.

Method: The algorithm is based on the Enhanced Suffix Array (ESA), which consists of the Suffix Array (SA) and the Longest Common Prefix (LCP) array. The SA lists all suffixes of a string in sorted order (stored as their starting positions), and the LCP array stores the length of the longest common prefix of each pair of consecutive suffixes in the sorted suffix array.
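To make the two arrays concrete, the following is a minimal Python sketch of an SA and LCP array for a short DNA string. It is an illustration under stated assumptions, not the TR-ESA implementation: the function names `build_suffix_array` and `build_lcp_array` are hypothetical, the suffix array is built by naive sorting rather than by a linear-time construction, and the LCP array is computed with Kasai's algorithm, a standard technique that the abstract does not name.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction by sorting suffix slices; suitable for short
    illustrative strings only, not for whole genomes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    """Kasai's algorithm (an assumption here, not taken from the paper).

    lcp[k] is the length of the longest common prefix of the suffixes
    starting at sa[k - 1] and sa[k]; lcp[0] is defined as 0.
    """
    n = len(text)
    rank = [0] * n                      # rank[i] = position of suffix i in sa
    for k, start in enumerate(sa):
        rank[start] = k

    lcp = [0] * n
    h = 0                               # current common-prefix length
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]         # suffix preceding suffix i in sorted order
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1                  # the next suffix loses at most one character
        else:
            h = 0
    return lcp


if __name__ == "__main__":
    dna = "ACGACG"
    sa = build_suffix_array(dna)
    lcp = build_lcp_array(dna, sa)
    print(sa)    # [3, 0, 4, 1, 5, 2]
    print(lcp)   # [0, 3, 0, 2, 0, 1]
```

In this example the suffixes starting at positions 3 and 0 are adjacent in sorted order and share a prefix of length 3, which equals the distance between them; this reflects the exact tandem repeat ACG ACG. How TR-ESA itself uses the two arrays is described in the full paper, not in this sketch.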

Results: We compare the results of our computation with an existing application for searching exact tandem repeats, the Burrows Wheeler Tandem Repeat Searcher (BWtrs). We provide an open source standalone application called TR-ESA (available at: www.algorithms-akgec-shivika.in/tandem), which implements the search for maximum-length exact tandem repeats.

Conclusion: The tool is efficient and powerful, and it allows complete genomes to be analyzed for exact tandem repeats.

Keywords: DNA, tandem repeats, exact tandem repeats, microsatellites, enhanced suffix array.

Article Details

VOLUME: 13
ISSUE: 2
Year: 2018
Page: [216 - 222]
Pages: 7
DOI: 10.2174/1574893612666170529120424