CAMA: Energy and Memory Efficient Automata Processing in Content-Addressable Memories

12/01/2021
by Yi Huang, et al.

Accelerating finite automata processing is critical for advancing real-time analytics in pattern matching, data mining, bioinformatics, intrusion detection, and machine learning. Recent in-memory automata accelerators leveraging SRAMs and DRAMs have shown exciting improvements over conventional digital designs. However, the bit-vector representation of state transitions used by all state-of-the-art (SOTA) designs is only optimal for processing worst-case, completely random patterns; a significant amount of memory and energy is wasted when running most real-world benchmarks. We present CAMA, a Content-Addressable Memory (CAM) enabled Automata accelerator for processing homogeneous non-deterministic finite automata (NFA). A radically different state representation scheme, along with co-designed novel circuits and data encoding schemes, greatly reduces energy, memory, and chip area for most realistic NFAs. CAMA is holistically optimized with the following major contributions: (1) a 16x256 8-transistor (8T) CAM array for state matching, replacing the 256x256 6T SRAM array or two 16x256 6T SRAM banks in SOTA designs; (2) a novel encoding scheme that enables content searching within 8T SRAMs and adapts to different applications; (3) a reconfigurable and scalable architecture that improves efficiency on all tested benchmarks, without losing support for any NFA that is compatible with SOTA designs; (4) an optimization framework that automates the choice of encoding schemes and maps a given NFA to the proposed hardware. Two versions of CAMA, one optimized for energy (CAMA-E) and the other for throughput (CAMA-T), are comprehensively evaluated in a 28nm CMOS process and across 21 real-world and synthetic benchmarks. CAMA-E achieves 2.1x, 2.8x, and 2.04x lower energy than CA, 2-stride Impala, and eAP, respectively. CAMA-T shows 2.68x, 3.87x, and 2.62x higher average compute density than 2-stride Impala, CA, and eAP, respectively.
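To make the homogeneous-NFA processing model concrete, the sketch below models one processing step in plain Python. It is a functional illustration only, not the paper's design: the names (State, step) are illustrative, and the per-state "does this state accept the current symbol?" check is exactly the operation that bit-vector designs answer with a 256-bit column read and that a CAM-based design such as CAMA answers with a parallel content search over compact symbol-set encodings.

```python
# Minimal sketch of one homogeneous-NFA step (illustrative names and model,
# not the CAMA hardware). Each state owns a symbol set and a successor list.
from dataclasses import dataclass

@dataclass
class State:
    accepted: frozenset        # input symbols (0..255) this state matches
    successors: tuple = ()     # indices of states enabled after a match
    reporting: bool = False    # whether a match here is a reported hit

def step(states, active, symbol):
    """Advance the active-state set by one input symbol."""
    # 1) Symbol match: evaluated in parallel across all states in hardware
    #    (column read in bit-vector designs, content search in a CAM design).
    matched = [symbol in s.accepted for s in states]
    # 2) Activation: a state fires only if it is both active and matched.
    fired = [i for i in active if matched[i]]
    # 3) Transition: fired states enable their successors for the next symbol.
    next_active, reports = set(), []
    for i in fired:
        next_active.update(states[i].successors)
        if states[i].reporting:
            reports.append(i)
    return next_active, reports

# Example: a two-state automaton that reports on the substring "ab".
states = [State(frozenset({ord('a')}), successors=(1,)),
          State(frozenset({ord('b')}), reporting=True)]
active = {0}
for ch in b"xab":
    active, hits = step(states, active, ch)
    active.add(0)  # keep the start state always active
```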
