MyPhi: Efficient Levenshtein Distance Computation on Xeon Phi Based Architectures

Yuandong        Chan; Kai        Xu; Haidong        Lan; Bertil        Schmidt; Shaoliang        Peng; Weiguo        Liu

Abstract

Background: Approximate string matching algorithms are widely used in bioinformatics, among which the bit-parallel Myers algorithm is a popular approach to compute the Levenshtein distance between two genome sequences. The bit vector encoding of the Myers algorithm makes it feasible to extend to modern parallel architectures with wider-than-ever vector registers and many cores.

Objective: Myers algorithm has already been integrated into some NGS all mappers such as RazerS and GEM for the verification stage. Due to the huge number of NGS reads to be processed, it is demanded to accelerate the bit-parallel Myers algorithm for higher throughput of NGS all mappers. In this paper, we aim to design an ultra-fast implementation of Myers algorithm on Intel Xeon Phi based architectures, including KNL-based processors and KNC-based co-processors.

Method: We designed a two-level framework to fully exploit the computing power of Xeon Phi based many-core architectures. At the coarse-grained thread level, we used multi-threading to invoke many cores. At the fine-grained VPU level, we proposed a novel vectorized computing method for the Myers algorithm. The in-depth analysis for memory access leads to a more cache friendly searching strategy.

Results: Performance evaluation revealed that MyPhi achieved a peak performance of 1.03 and 1.62 TCUPS (Trillion Cell Updates per Second) on KNC-based Xeon Phi 7110 co-processor and KNL-based Xeon Phi 7210 processor, respectively, which outperformed a multi-threaded scalar implementation on dual six-core CPUs by an average speed up of 8.95 and 14.08.

Conclusion: We presented the MyPhi to compute the Levenshtein distance between the two strings efficiently on Xeon Phi based architectures. Performance evaluation has shown good speedups over other CPU-based and accelerator-enabled works as well as good scalability. MyPhi can be further used as building blocks for short read aligners, clustering algorithms and potentially other sequence aligning tools.

Keywords: Approximate string matching, myers algorithm, levenshtein distance, SIMD, Xeon Phi, KNC, KNL.

« Previous Next »

Graphical Abstract

Rights & Permissions Print Cite

Article Metrics

35

1

Journal Information

For Authors

For Editors

For Reviewers

Explore Articles

Open Access

Open Access Articles

For Visitors

DOI https://dx.doi.org/10.2174/1574893612666171122150933	Print ISSN 1574-8936
Publisher Name Bentham Science Publisher	Online ISSN 2212-392X

Current Bioinformatics

MyPhi: Efficient Levenshtein Distance Computation on Xeon Phi Based Architectures

Abstract

Graphical Abstract

Related Journals

Related Books