Parallel Accelerated Vector Similarity Calculations for Genomics Applications

05/23/2017
by Wayne Joubert, et al.

The surge in availability of genomic data holds promise for determining the genetic causes of observed individual traits, with applications to problems such as discovering the genetic roots of phenotypes, be they molecular phenotypes such as gene expression or metabolite concentrations, or complex phenotypes such as diseases. However, the growing sizes of these datasets and the quadratic, cubic, or higher-order scaling of the relevant algorithms pose a serious computational challenge that necessitates the use of leadership-scale computing. In this paper we describe a new approach to performing vector similarity metric calculations, suitable for parallel systems equipped with graphics processing units (GPUs) or Intel Xeon Phi processors. Our primary focus is the Proportional Similarity metric applied to Genome Wide Association Studies (GWAS) and Phenome Wide Association Studies (PheWAS). We describe the implementation of the algorithms on accelerated processors, methods for eliminating redundant calculations that arise from symmetries, and techniques for efficiently mapping the calculations to many-node parallel systems. Results are presented demonstrating high per-node performance and parallel scalability, with rates of more than five quadrillion elementwise comparisons per second achieved on the ORNL Titan system. In a companion paper we describe corresponding techniques applied to calculations of the Custom Correlation Coefficient for comparative genomics applications.
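To make the central computation concrete, here is a minimal serial sketch of an all-pairs Proportional Similarity calculation. The abstract does not state the formula, so this assumes the commonly used Czekanowski form, 2 Σ min(u_i, v_i) / Σ (u_i + v_i), on vectors with non-negative entries. The function names and the zero-denominator convention are illustrative choices, not taken from the paper. Because the metric is symmetric, only pairs with i < j are evaluated, which is the simplest form of the redundancy elimination the abstract mentions; the GPU implementation and multi-node mapping are not represented here.

```python
from itertools import combinations


def proportional_similarity(u, v):
    """Assumed Czekanowski form: 2 * sum(min(u_i, v_i)) / sum(u_i + v_i).

    Expects equal-length sequences of non-negative numbers.
    """
    denom = sum(u) + sum(v)
    if denom == 0:
        # Convention chosen for this sketch only: two all-zero vectors score 0.
        return 0.0
    return 2.0 * sum(min(a, b) for a, b in zip(u, v)) / denom


def all_pairs_similarity(vectors):
    """Compute the metric for every unordered pair (i, j) with i < j.

    The metric is symmetric, so the (j, i) entries are never computed.
    """
    return {
        (i, j): proportional_similarity(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy data: three vectors of three elements each.
    toy = [[1, 0, 2], [1, 1, 0], [0, 3, 1]]
    for pair, value in sorted(all_pairs_similarity(toy).items()):
        print(pair, round(value, 4))
```

For n vectors this evaluates n(n-1)/2 pairs rather than n², and each pair costs one pass over the vector elements, which is the quadratic scaling the abstract refers to.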


research
05/23/2017

Parallel Accelerated Custom Correlation Coefficient Calculations for Genomics Applications

The massive quantities of genomic data being made available through gene...
research
01/09/2008

Toward the Graphics Turing Scale on a Blue Gene Supercomputer

We investigate raytracing performance that can be achieved on a class of...
research
01/28/2022

Experiences with managing data parallel computational workflows for High-throughput Fragment Molecular Orbital (FMO) Calculations

Fragment Molecular Orbital (FMO) calculations provide a framework to spe...
research
10/10/2017

RLT2-based Parallel Algorithms for Solving Large Quadratic Assignment Problems on Graphics Processing Unit Clusters

This paper discusses efficient parallel algorithms for obtaining strong ...
research
11/10/2012

Efficient network-guided multi-locus association mapping with graph cuts

As an increasing number of genome-wide association studies reveal the li...
research
04/05/2023

Accelerated high-cycle phase field fatigue predictions

Phase field fracture models have seen widespread application in the last...
