ClaPIM: Scalable Sequence CLAssification using Processing-In-Memory

02/16/2023
by Marcel Khalifa, et al.

DNA sequence classification is a fundamental task in computational biology, with broad implications for applications such as disease prevention and drug design. Fast, high-quality sequence classifiers are therefore essential. This paper introduces ClaPIM, a scalable DNA sequence classification architecture based on the emerging concept of hybrid in-crossbar and near-crossbar memristive processing-in-memory (PIM). We enable efficient, high-quality classification by uniting the filter and search stages within a single algorithm. Specifically, we propose a custom filtering technique that drastically narrows the search space, and a search approach that facilitates approximate string matching through a distance function. ClaPIM is the first PIM architecture for scalable approximate string matching that benefits from both the high density of memristive crossbar arrays and the massive computational parallelism of PIM. Compared with Kraken2, a state-of-the-art software classifier, ClaPIM provides significantly higher classification quality (up to a 20x improvement in F1 score) and a 1.8x throughput improvement. Compared with EDAM, a recently proposed SRAM-based accelerator that is restricted to small datasets, we observe both a 30.4x improvement in normalized throughput per area and a 7
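The filter-then-search idea described in the abstract can be illustrated with a small software sketch. This is not ClaPIM's algorithm or hardware mapping: the abstract does not specify the filter or the distance function, so the prefix-bucket filter, the Hamming distance, and all names here (`build_index`, `classify`, `prefix_len`, `max_distance`) are assumptions chosen only to show how a filter stage can narrow the candidates that an approximate-matching search stage must examine.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_index(references: dict[str, list[str]], prefix_len: int):
    """Filter stage (assumed): bucket reference k-mers by their leading bases."""
    index = defaultdict(list)
    for label, kmers in references.items():
        for kmer in kmers:
            index[kmer[:prefix_len]].append((label, kmer))
    return index


def classify(query: str, index, prefix_len: int, max_distance: int):
    """Search stage (assumed): label of the closest candidate in the query's
    bucket, or None if no candidate is within max_distance mismatches."""
    best_label, best_dist = None, max_distance + 1
    for label, kmer in index.get(query[:prefix_len], []):
        dist = hamming(query, kmer)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical toy data: two classes with 8-base reference k-mers.
references = {"A": ["ACGTACGT"], "B": ["ACGTTTTT", "GGGGCCCC"]}
index = build_index(references, prefix_len=4)
label = classify("ACGTACGA", index, prefix_len=4, max_distance=2)
```

Only the k-mers that share the query's prefix are compared, which is the point of the filter. A prefix filter like this one misses matches whose mismatches fall inside the prefix, so it trades some sensitivity for a smaller search space.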


research
09/07/2017

A Non-volatile Near-Memory Read Mapping Accelerator

DNA sequencing entails the process of determining the precise physical o...
research
06/12/2016

Application-Driven Near-Data Processing for Similarity Search

Similarity search is a key to a variety of applications including conten...
research
09/16/2020

GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis

Genome sequence analysis has enabled significant advancements in medical...
research
05/31/2022

DNA Pattern Matching Acceleration with Analog Resistive CAM

DNA pattern matching is essential for many widely used bioinformatics ap...
research
12/10/2017

A novel algorithm for online inexact string matching and its FPGA implementation

Accelerating inexact string matching procedures is of utmost importance ...
research
11/17/2022

Knowledge distillation for fast and accurate DNA sequence correction

Accurate genome sequencing can improve our understanding of biology and ...
research
04/24/2017

GaKCo: a Fast GApped k-mer string Kernel using COunting

String Kernel (SK) techniques, especially those using gapped k-mers as f...
