Characterizing naturally occurring variation in the human genome has captivated the high-performance computing community over the past few years. Changes known as biallelic Single-Nucleotide Polymorphisms (SNPs) have become essential biomarkers of both evolutionary relationships and propensity to degenerative diseases. It is increasingly accepted that traditional statistical SNP analysis in Genome-Wide Association Studies (GWAS) reveals only a small part of the heritability of complex diseases. The study of interactions among SNPs has been suggested as a plausible approach to identifying further SNPs that contribute to disease but either do not reach genome-wide significance or exhibit only epistatic effects. We have introduced a methodology for genome-wide screening of epistatic interactions that is feasible with state-of-the-art high-performance computing technology. Unlike standard software, our method computes all Boolean binary interactions between SNPs across the whole genome without assuming a particular model of interaction. This exhaustive search for epistasis comes at the expense of higher computational complexity, which we tackle using graphics processors (GPUs) to reduce the computation time from several months on a cluster of CPUs to 3-4 days on a multi-GPU platform.
Our work also contributes a new entropy-based function to evaluate the interaction between SNPs. It does not compromise findings about the most significant SNP interactions, yet it is more than 4000 times lighter in computational time when running on GPUs and yields code more than 100x faster than on a CPU of similar cost. We deploy a number of optimization techniques to tune the CUDA implementation of this function and show how to enhance scalability on larger data sets. The role of our implementation as an accelerator is discussed on a wide variety of Nvidia GPUs, including the three most popular profiles in graphics computing: high-end cards targeted at High Performance Computing (Tesla), top sellers for video gamers (GeForce), and emerging low-power devices for mobile computing (Tegra). We analyze the pros and cons of each approach and study their evolution from the Fermi (2012) to the Kepler (2014) generation, showing what they can contribute to speeding up computationally demanding biomedical codes like ours.
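
To make the idea of an exhaustive, entropy-based pairwise screen concrete, the following is a minimal sketch in plain Python rather than CUDA. The abstract does not define the entropy-based function itself, so the sketch substitutes a standard information-theoretic measure, the mutual information between a pair of genotypes and the case/control status, purely as an illustration. The names `entropy`, `pair_score`, and `screen_all_pairs`, as well as the toy data, are hypothetical and not taken from the original work.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def pair_score(snp_a, snp_b, phenotype):
    """Illustrative score (an assumption, not the paper's function):
    mutual information I(A,B; Y) = H(Y) + H(A,B) - H(A,B,Y)."""
    joint = list(zip(snp_a, snp_b))
    joint_y = list(zip(snp_a, snp_b, phenotype))
    return entropy(phenotype) + entropy(joint) - entropy(joint_y)


def screen_all_pairs(genotypes, phenotype):
    """Score every unordered SNP pair without assuming an interaction model.
    Returns a list of (score, i, j) tuples sorted by descending score."""
    scores = [
        (pair_score(genotypes[i], genotypes[j], phenotype), i, j)
        for i, j in combinations(range(len(genotypes)), 2)
    ]
    return sorted(scores, reverse=True)


# Toy data (hypothetical): 3 SNPs over 8 samples, binary-coded for simplicity.
# The phenotype is the XOR of SNP 0 and SNP 1, a purely epistatic effect
# that neither SNP reveals on its own.
toy_genotypes = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
toy_phenotype = [0, 1, 1, 0, 0, 1, 1, 0]

best = screen_all_pairs(toy_genotypes, toy_phenotype)[0]  # (1.0, 0, 1)
```

In this toy case the pair (0, 1) carries one full bit of information about the phenotype while every other pair scores zero. The number of pairs grows quadratically with the number of SNPs, and each pair can be scored independently, which is the kind of workload that maps naturally onto GPU threads.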