Hyperdrive: A Systolically Scalable Binary-Weight CNN Inference Engine for mW IoT End-Nodes

03/05/2018
by Renzo Andri, et al.

Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute- and memory-intensive, which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces their computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding the I/O bandwidth and system-level efficiency that are crucial for deploying accelerators in ultra-low-power devices. We present Hyperdrive: a BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel binary-weight streaming approach and that can handle high-resolution images by virtue of its systolically scalable architecture. We achieve a system-level efficiency of 5.9 TOp/s/W (i.e., including I/Os), 2.2x higher than state-of-the-art BNN accelerators, even though our core uses resource-intensive FP16 arithmetic for increased robustness.
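To illustrate what pushing weight quantization to the limit means in practice, here is a minimal sketch of a binary-weight dot product. It is not Hyperdrive's datapath: the function names are hypothetical, and the single shared scale factor (the mean weight magnitude) is an assumption borrowed from common BWN training schemes rather than something stated in the abstract. The point is only that each weight reduces to one sign bit, so every multiply-accumulate becomes an add or a subtract on the full-precision activation.

```python
def binarize(weights):
    """Map real-valued weights to +1/-1 signs plus one shared scale.

    The mean-magnitude scale is an illustrative assumption, not taken
    from the Hyperdrive abstract.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def binary_weight_dot(activations, signs, scale):
    """Dot product with binary weights: each term is an add or a subtract."""
    acc = 0.0
    for a, s in zip(activations, signs):
        acc = acc + a if s > 0 else acc - a
    # A single multiplication by the shared scale replaces per-weight multiplies.
    return scale * acc


weights = [0.5, -0.25, 0.75, -0.5]
activations = [1.0, 2.0, 3.0, 4.0]
signs, scale = binarize(weights)
result = binary_weight_dot(activations, signs, scale)
print(signs, scale, result)
```

Because only the sign of each weight is kept, a weight occupies one bit instead of 16 or 32, which is the source of the memory and bandwidth savings that the abstract refers to.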


research
03/05/2018

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

Deep neural networks have achieved impressive results in computer vision...
research
06/17/2016

YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

Convolutional neural networks (CNNs) have revolutionized the world of co...
research
03/19/2018

Local Binary Pattern Networks

Memory and computation efficient deep learning architectures are cruci...
research
05/12/2020

ChewBaccaNN: A Flexible 223 TOPS/W BNN Accelerator

Binary Neural Networks enable smart IoT devices, as they significantly r...
research
04/20/2023

ULEEN: A Novel Architecture for Ultra Low-Energy Edge Neural Networks

The deployment of AI models on low-power, real-time edge devices require...
research
09/26/2022

Going Further With Winograd Convolutions: Tap-Wise Quantization for Efficient Inference on 4x4 Tile

Most of today's computer vision pipelines are built around deep neural n...
research
07/17/2020

Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node

Binary Neural Networks (BNNs) have been shown to be robust to random bit...
