A Non-volatile Near-Memory Read Mapping Accelerator

09/07/2017
by S. Karen Khatamifard, et al.

DNA sequencing is the process of determining the precise physical order of the four bases (Adenine, Guanine, Cytosine, Thymine) in a DNA strand. Just as semiconductor technology revolutionized computing, DNA sequencing technology, often termed Next Generation Sequencing (NGS), has revolutionized genomic research. Modern NGS platforms can sequence millions of short DNA fragments in parallel. The resulting short DNA sequences are termed (short) reads. Sequence mapping, an emerging application, maps each read to a reference genome of the same species (which itself represents a full-fledged assembly of already sequenced reads). Sequence mapping enables detailed study of genetic variations, and thereby catalyzes personalized health care solutions. Due to the large scale of the problem, well-studied pair-wise sequence similarity detection (or sequence alignment) algorithms fall short of efficiently mapping individual reads to the reference genome. Mapping is a search-heavy, data-intensive operation that features barely any complex floating point arithmetic. Therefore, sequence mapping can greatly benefit from in- or near-memory search and processing. Fast parallel associative search enabled by Ternary Content Addressable Memory (TCAM) can particularly help. However, CMOS-based TCAM implementations cannot accommodate the large memory footprint in an area- and energy-efficient manner, which is where non-volatile TCAM comes to the rescue. Still, brute-force TCAM search over a search space as large as sequence mapping demands consumes unacceptably high energy. This paper presents BioCAM, an effective solution to the energy problem that taps the potential of non-volatile TCAM for high-throughput, energy-efficient sequence mapping. Compared to a highly-optimized software implementation for modern GPUs, BioCAM can improve the throughput of sequence mapping by 7.5x and the energy consumption by 109.0x.
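To make the idea of ternary associative search concrete, the following is a minimal software sketch of what a TCAM-style lookup computes. It is not BioCAM's design, which the abstract does not detail. The 2-bit base encoding, the use of `N` as a "don't care" symbol, and the names `encode` and `tcam_search` are all assumptions introduced for illustration. A real TCAM compares the query against every stored row in parallel; the loop here only stands in for that.

```python
# Illustrative sketch only: a software model of ternary (match / don't-care) search.
# The encoding and function names are hypothetical, not taken from the paper.

BASE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode(seq):
    """Pack a sequence into (value, care_mask), 2 bits per base.

    'N' is treated as a don't-care position: its mask bits stay 0.
    """
    value = 0
    mask = 0
    for base in seq:
        value <<= 2
        mask <<= 2
        if base != "N":
            value |= BASE_BITS[base]
            mask |= 0b11
    return value, mask


def tcam_search(reference, query):
    """Return every start position where `query` matches `reference`.

    Each window of the reference plays the role of one stored TCAM row.
    A row matches when all bits that both sides care about are equal.
    """
    k = len(query)
    q_val, q_mask = encode(query)
    hits = []
    for pos in range(len(reference) - k + 1):  # parallel in real hardware
        row_val, row_mask = encode(reference[pos:pos + k])
        care = q_mask & row_mask
        if (q_val ^ row_val) & care == 0:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    print(tcam_search("ACGTACGA", "ACG"))
    print(tcam_search("ACGTACGA", "CGN"))
```

Every query touches every row in this model, which mirrors why the abstract describes brute-force search over a genome-scale space as too energy-hungry.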


research
01/17/2019

BioSEAL: In-Memory Biological Sequence Alignment Accelerator for Large-Scale Genomic Data

Genome sequences contain hundreds of millions of DNA base pairs. Finding...
research
05/26/2015

Large-scale Machine Learning for Metagenomics Sequence Classification

Metagenomics characterizes the taxonomic diversity of microbial communit...
research
02/16/2023

ClaPIM: Scalable Sequence CLAssification using Processing-In-Memory

DNA sequence classification is a fundamental task in computational biolo...
research
08/14/2020

PANDA: Processing-in-MRAM Accelerated De Bruijn Graph based DNA Assembly

Spurred by widening gap between data processing speed and data communica...
research
11/21/2017

Accelerating K-mer Frequency Counting with GPU and Non-Volatile Memory

The emergence of Next Generation Sequencing (NGS) platforms has increase...
research
11/18/2016

Fast low-level pattern matching algorithm

This paper focuses on pattern matching in the DNA sequence. It was inspi...
