pLUTo: In-DRAM Lookup Tables to Enable Massively Parallel General-Purpose Computation

04/15/2021
by João Dinis Ferreira, et al.
Data movement between main memory and the processor is a significant contributor to the execution time and energy consumption of memory-intensive applications. This data movement bottleneck can be alleviated using Processing-in-Memory (PiM), which enables computation inside the memory chip. However, existing PiM architectures often lack support for complex operations, since supporting these operations increases design complexity, chip area, and power consumption. We introduce pLUTo (processing-in-memory with lookup table [LUT] operations), a new DRAM substrate that leverages the high area density of DRAM to enable the massively parallel storing and querying of lookup tables (LUTs). The use of LUTs enables the efficient execution of complex operations in-memory, which has been a long-standing challenge in the domain of PiM. When running a state-of-the-art binary neural network in a single DRAM subarray, pLUTo outperforms the baseline CPU and GPU implementations by 33× and 8×, respectively, while simultaneously achieving energy savings of 110× and 80×.
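As a rough software analogy for the LUT idea described above, the sketch below precomputes a function for every possible input and then answers each query with a single table lookup. It is only an illustration under stated assumptions: the helper names `build_lut` and `lut_query` are hypothetical, the 8-bit population count is chosen simply because bit counting is a common step in binary neural networks, and none of this models how pLUTo stores or queries tables inside DRAM.

```python
def build_lut(func, input_bits):
    """Precompute func for every possible input of the given bit width.

    Hypothetical helper: the table has 2**input_bits entries, indexed by input value.
    """
    return [func(x) for x in range(1 << input_bits)]


def lut_query(lut, inputs):
    """Replace each input element with its precomputed table entry.

    Each element is independent of the others, which is what makes
    this pattern a natural fit for bulk parallel execution.
    """
    return [lut[x] for x in inputs]


# Illustrative operation (an assumption, not taken from the paper):
# count the set bits in each 8-bit value.
popcount_lut = build_lut(lambda x: bin(x).count("1"), 8)

data = [0b00000000, 0b00001111, 0b10101010, 0b11111111]
result = lut_query(popcount_lut, data)
print(result)  # [0, 4, 4, 8]
```

The cost of the operation is paid once when the table is built. After that, an arbitrarily complex function costs the same per element as a trivial one, as long as its input width keeps the table small enough to store.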

research
12/22/2020

SIMDRAM: A Framework for Bit-Serial SIMD Processing Using DRAM

Processing-using-DRAM has been proposed for a limited set of basic opera...
research
06/30/2023

HashMem: PIM-based Hashmap Accelerator

Hashmaps are widely utilized data structures in many applications to per...
research
01/23/2017

Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes

High-performance computing systems are moving towards 2.5D and 3D memory...
research
10/18/2021

In-memory Multi-valued Associative Processor

In-memory associative processor architectures are offered as a great can...
research
06/01/2022

YOLoC: DeploY Large-Scale Neural Network by ROM-based Computing-in-Memory using ResiduaL Branch on a Chip

Computing-in-memory (CiM) is a promising technique to achieve high energ...
research
10/19/2022

Scalable Coherent Optical Crossbar Architecture using PCM for AI Acceleration

Optical computing has been recently proposed as a new compute paradigm t...
research
06/29/2017

Fast Processing of Large Graph Applications Using Asynchronous Architecture

Graph algorithms and techniques are increasingly being used in scientifi...
