A Streaming Accelerator for Deep Convolutional Neural Networks with Image and Feature Decomposition for Resource-limited System Applications

09/15/2017
by   Yuan Du, et al.
0

Deep convolutional neural networks (CNN) are widely used in modern artificial intelligence (AI) and smart vision systems but also limited by computation latency, throughput, and energy efficiency on a resource-limited scenario, such as mobile devices, internet of things (IoT), unmanned aerial vehicles (UAV), and so on. A hardware streaming architecture is proposed to accelerate convolution and pooling computations for state-of-the-art deep CNNs. It is optimized for energy efficiency by maximizing local data reuse to reduce off-chip DRAM data access. In addition, image and feature decomposition techniques are introduced to optimize memory access pattern for an arbitrary size of image and number of features within limited on-chip SRAM capacity. A prototype accelerator was implemented in TSMC 65 nm CMOS technology with 2.3 mm x 0.8 mm core area, which achieves 144 GOPS peak throughput and 0.8 TOPS/W peak energy efficiency.

READ FULL TEXT

page 1

page 4

research
07/08/2017

A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things

Convolutional neural network (CNN) offers significant accuracy in image ...
research
03/05/2018

XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks

Deploying state-of-the-art CNNs requires power-hungry processors and off...
research
06/25/2019

A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks

Neural Networks (NNs) have become the mainstream technology in the artif...
research
01/18/2018

ECA: Energy-Efficient FPGA-based Convolutional Neural Networks Architecture for Single Image Super-Resolution

Convolutional neural networks (CNN) show the excellent performance compa...
research
10/01/2020

CARLA: A Convolution Accelerator with a Reconfigurable and Low-Energy Architecture

Convolutional Neural Networks (CNNs) have proven to be extremely accurat...
research
03/04/2017

Chain-NN: An Energy-Efficient 1D Chain Architecture for Accelerating Deep Convolutional Neural Networks

Deep convolutional neural networks (CNN) have shown their good performan...
research
03/29/2018

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision

Continuous computer vision (CV) tasks increasingly rely on convolutional...

Please sign up or login with your details

Forgot password? Click here to reset