ROMANet: Fine-Grained Reuse-Driven Data Organization and Off-Chip Memory Access Management for Deep Neural Network Accelerators

Many hardware accelerators have been proposed to improve the computational efficiency of deep neural network (DNN) inference. However, off-chip memory accesses, being the most energy-consuming operation in such architectures, prevent these designs from realizing their full efficiency potential. To address this, we propose ROMANet, a methodology for investigating efficient dataflow patterns that reduce the number of off-chip memory accesses. ROMANet adaptively determines the data reuse pattern for each convolutional layer of a network by considering the reuse factors of weights, input activations, and output activations. It also considers the data mapping inside the off-chip memory to reduce row-buffer misses and increase parallelism. Our experimental results show that the ROMANet methodology achieves up to 50% dynamic energy savings in state-of-the-art DNN accelerators.
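As a rough illustration of the reuse-factor reasoning described above (not the paper's actual algorithm), the sketch below estimates simplified per-layer reuse factors for weights, input activations, and output activations of a convolutional layer, and selects the operand with the highest reuse to keep stationary on-chip as a first-order proxy for minimizing off-chip accesses. The layer shape, the reuse formulas, and all function names are illustrative assumptions.

```python
# Minimal sketch, assuming a square-kernel conv layer without padding.
# Reuse factors here are first-order estimates, not the authors' exact model.
from dataclasses import dataclass

@dataclass
class ConvLayer:
    H: int; W: int; C: int   # input height, width, channels
    K: int                   # kernel size (assumed square)
    M: int                   # number of filters / output channels
    stride: int = 1

def reuse_factors(layer: ConvLayer) -> dict:
    P = (layer.H - layer.K) // layer.stride + 1   # output height (no padding)
    Q = (layer.W - layer.K) // layer.stride + 1   # output width
    return {
        # each weight is applied at every output position
        "weights": P * Q,
        # each input activation feeds ~K*K kernel positions for all M filters
        "input_acts": layer.K * layer.K * layer.M,
        # each output accumulates K*K*C partial sums before it is final
        "output_acts": layer.K * layer.K * layer.C,
    }

def pick_stationary_operand(layer: ConvLayer) -> str:
    # Keep the operand with the highest reuse on-chip for this layer.
    rf = reuse_factors(layer)
    return max(rf, key=rf.get)

if __name__ == "__main__":
    layer = ConvLayer(H=56, W=56, C=64, K=3, M=128)
    print(reuse_factors(layer))            # {'weights': 2916, 'input_acts': 1152, 'output_acts': 576}
    print(pick_stationary_operand(layer))  # 'weights' -> favor a weight-reuse-oriented pattern here
```

A complementary step, only hinted at conceptually here, would be the off-chip data mapping: placing each layer's tiles contiguously within a DRAM row and interleaving tiles across banks, so that successive accesses hit open rows and exploit bank-level parallelism, in line with the abstract's goal of reducing row-buffer misses.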
