FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture

01/28/2019
by Yu Ji, et al.

Neural network (NN) accelerators built on emerging ReRAM (resistive random access memory) technologies are a promising way to address the memory wall, because ReRAM-crossbar-based processing elements (PEs) can compute directly in memory. However, the efficiency and density advantages of ReRAM have not been fully exploited, owing to the heavy communication demands among PEs and the overhead of peripheral circuits. In this paper, we propose a full system stack solution consisting of a reconfigurable architecture, the Field Programmable Synapse Array (FPSA), and its software system, which includes a neural synthesizer, a temporal-to-spatial mapper, and a placement & routing tool. We lean heavily on the software system to keep the hardware design compact and efficient. To satisfy the high-performance communication demand, we combine a reconfigurable routing architecture with the placement & routing tool. To improve computational density, we greatly simplify the PE circuit with a spiking scheme and use the neural synthesizer so that the resulting high-density computation resources can support different kinds of NN operations. In addition, we provide spiking memory blocks (SMBs) and configurable logic blocks (CLBs) in hardware and rely on the temporal-to-spatial mapper to balance the storage and computation requirements of an NN across them. Owing to this end-to-end software system, existing deep neural networks can be deployed to FPSA efficiently. Evaluations show that, compared with PRIME, a state-of-the-art ReRAM-based NN accelerator, FPSA improves computational density by 31x, and for representative NNs its inference performance achieves up to a 1000x speedup.
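To make the crossbar-PE computation concrete, the sketch below tiles a layer's weight matrix onto fixed-size crossbar PEs and accumulates their partial matrix-vector products, which is the basic mapping step any crossbar-based accelerator performs and the source of the inter-PE traffic discussed above. The 128x128 PE size, the function names, and the plain NumPy emulation are illustrative assumptions, not details of the FPSA toolchain.

    import numpy as np

    # Assumed crossbar dimensions (illustrative, not from the paper).
    PE_ROWS, PE_COLS = 128, 128

    def tile_weights(W):
        """Split a weight matrix into PE-sized tiles; each tile would be
        programmed into one ReRAM crossbar as conductances."""
        tiles = {}
        for r0 in range(0, W.shape[0], PE_ROWS):
            for c0 in range(0, W.shape[1], PE_COLS):
                tiles[(r0 // PE_ROWS, c0 // PE_COLS)] = W[r0:r0 + PE_ROWS,
                                                          c0:c0 + PE_COLS]
        return tiles

    def crossbar_mvm(tiles, x, out_dim):
        """Emulate the in-memory multiply: each PE yields a partial sum, and
        partial sums along a row of tiles are accumulated. Moving these
        partial sums between PEs is the communication the routing
        architecture must carry."""
        y = np.zeros(out_dim)
        for (ti, tj), tile in tiles.items():
            x_slice = x[tj * PE_COLS: tj * PE_COLS + tile.shape[1]]
            y[ti * PE_ROWS: ti * PE_ROWS + tile.shape[0]] += tile @ x_slice
        return y

    if __name__ == "__main__":
        W = np.random.randn(512, 384)   # one fully connected layer
        x = np.random.randn(384)
        tiles = tile_weights(W)
        y = crossbar_mvm(tiles, x, W.shape[0])
        assert np.allclose(y, W @ x)    # matches the dense reference result
        print(f"{len(tiles)} PEs used for a {W.shape} layer")

In this toy mapping, a 512x384 layer already occupies 12 PEs and every output element needs partial sums from 3 of them, which illustrates why communication and placement & routing dominate once many layers are mapped spatially.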


