SMART Paths for Latency Reduction in ReRAM Processing-In-Memory Architecture for CNN Inference

04/10/2020
by Sho Ko, et al.

This work proposes an analog ReRAM-based PIM (processing-in-memory) architecture for fast and efficient CNN (convolutional neural network) inference. For the overall architecture, we use a basic hardware hierarchy of node, tile, core, and subarray. On top of that, we design intra-layer pipelining, inter-layer pipelining, and batch pipelining to exploit parallelism in the architecture and increase overall throughput for inference on an input image stream. We also optimize the performance of the NoC (network-on-chip) routers by decreasing hop counts using SMART (single-cycle multi-hop asynchronous repeated traversal) flow control. Finally, we experiment with weight replications for different CNN layers in VGG (A-E) on the large-scale ImageNet data set. In our simulation, we achieve 40.4027 TOPS (tera-operations per second) for the best-case performance, which corresponds to over 1029 FPS (frames per second). We also achieve 3.5914 TOPS/W (tera-operations per second per watt) for the best-case energy efficiency. In addition, the architecture with aggressive pipelining and weight replications achieves a 14X speedup over the baseline architecture with basic pipelining, and SMART flow control achieves a 1.08X speedup in this architecture compared to the baseline. We also evaluate the performance of SMART flow control using synthetic traffic.
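The best-case throughput and frame rate quoted above can be related with a short back-of-the-envelope calculation. The sketch below is only illustrative: the function name `ops_per_frame` and the constant `OPS_PER_MAC` are hypothetical, and counting one multiply-accumulate as two operations is an assumption that the abstract does not state.

```python
def ops_per_frame(tops: float, fps: float) -> float:
    """Operations per frame implied by a throughput in TOPS and a frame rate in FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return tops * 1e12 / fps


# Figures quoted in the abstract for the best-case performance.
BEST_CASE_TOPS = 40.4027
BEST_CASE_FPS = 1029  # the abstract says "over 1029", so this is a lower bound

# Assumption (not from the abstract): one multiply-accumulate counts as 2 operations.
OPS_PER_MAC = 2

# Because the frame rate is a lower bound, these values are upper bounds.
implied_ops = ops_per_frame(BEST_CASE_TOPS, BEST_CASE_FPS)
implied_macs = implied_ops / OPS_PER_MAC

print(f"implied operations per frame: {implied_ops:.4e}")
print(f"implied MACs per frame:       {implied_macs:.4e}")
```

Under these assumptions, the quoted figures imply at most about 39.3 billion operations per frame. The best-case throughput and the best-case energy efficiency are reported separately and may come from different configurations, so the sketch does not derive a power figure from them.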


research
10/13/2020

High Area/Energy Efficiency RRAM CNN Accelerator with Kernel-Reordering Weight Mapping Scheme Based on Pattern Pruning

Resistive Random Access Memory (RRAM) is an emerging device for processi...
research
07/06/2021

CAP-RAM: A Charge-Domain In-Memory Computing 6T-SRAM for Accurate and Precision-Programmable CNN Inference

A compact, accurate, and bitwidth-programmable in-memory computing (IMC)...
research
01/07/2020

HyGCN: A GCN Accelerator with Hybrid Architecture

In this work, we first characterize the hybrid execution patterns of GCN...
research
09/07/2023

Mapping of CNNs on multi-core RRAM-based CIM architectures

RRAM-based multi-core systems improve the energy efficiency and performa...
research
09/03/2021

SMART: A Heterogeneous Scratchpad Memory Architecture for Superconductor SFQ-based Systolic CNN Accelerators

Ultra-fast & low-power superconductor single-flux-quantum (SFQ)-based CN...
research
05/19/2020

In-memory Implementation of On-chip Trainable and Scalable ANN for AI/ML Applications

Traditional von Neumann architecture based processors become inefficient...
research
06/27/2021

OCCAM: Optimal Data Reuse for Convolutional Neural Networks

Convolutional neural networks (CNNs) are emerging as powerful tools for ...
