Multi-level Parallelization of Genotype Imputation on Supercomputers

(E-pub Ahead of Print)

Author(s): Weiwen Zhang*, Long Wang, Theint Theint Aye, Juniarto Samsudin, Yongqing Zhu

Journal Name: Current Bioinformatics


Background: Genotype imputation as a service is developed to enable researchers to estimate genotypes on haplotyped data without performing whole-genome sequencing. However, genotype imputation is computationally intensive, so meeting the high-performance requirements of genome-wide association studies (GWAS) remains a challenge.

Objective: In this paper, we propose a high-performance computing solution that improves the execution performance of genotype imputation on supercomputers.

Method: We design and implement a multi-level parallelization that comprises job-level, process-level, and thread-level parallelization, enabled by job scheduling management, the Message Passing Interface (MPI), and OpenMP, respectively. It involves job distribution, chunk partitioning and execution, parallelized iteration for imputation, and data concatenation. This multi-level design lets us exploit multi-machine, multi-core architectures to improve the performance of genotype imputation.
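The chunk partitioning, parallel execution, and concatenation steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: Python threads stand in for MPI processes, `impute_chunk` is a hypothetical placeholder for the real imputation routine, and the round-robin assignment of chunks to workers is an assumption made only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_chunks(start, end, chunk_size):
    """Split the half-open marker range [start, end) into contiguous chunks."""
    return [(lo, min(lo + chunk_size, end)) for lo in range(start, end, chunk_size)]


def impute_chunk(chunk):
    """Hypothetical placeholder: the real routine would impute genotypes here."""
    lo, hi = chunk
    return [f"marker_{i}" for i in range(lo, hi)]


def assign_round_robin(chunks, n_workers):
    """Assumed distribution scheme: worker r takes chunks r, r + n, r + 2n, ..."""
    return [chunks[r::n_workers] for r in range(n_workers)]


def run_worker(assigned_chunks):
    """Process the chunks of one worker and tag each result with its chunk."""
    return [(chunk, impute_chunk(chunk)) for chunk in assigned_chunks]


def run_job(start, end, chunk_size, n_workers):
    """Partition one job, execute its chunks in parallel, concatenate in order."""
    chunks = partition_chunks(start, end, chunk_size)
    assignments = assign_round_robin(chunks, n_workers)
    results = {}
    # Threads stand in for MPI ranks in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for worker_output in pool.map(run_worker, assignments):
            for chunk, imputed in worker_output:
                results[chunk] = imputed
    # Data concatenation: merge chunk outputs by start position.
    merged = []
    for chunk in sorted(results):
        merged.extend(results[chunk])
    return merged


if __name__ == "__main__":
    print(partition_chunks(0, 10, 4))
    print(len(run_job(0, 10, 4, n_workers=2)))
```

Sorting by chunk start before merging keeps the output in marker order regardless of which worker finishes first, which is the property the concatenation step has to preserve.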

Results: Experimental results show that the proposed method outperforms the Hadoop-based implementation of genotype imputation. Moreover, we conduct experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it significantly shortens the execution time of genotype imputation.

Conclusion: The proposed multi-level parallelization, when deployed as imputation as a service, will help bioinformatics researchers in Singapore conduct genotype imputation and enhance association studies.

Keywords: Genotype imputation, parallelization, high performance computing, supercomputers, bioinformatics, performance evaluation.

Article Details

DOI: 10.2174/1574893615999200420071307