Hardware-Software Codesign of Accurate, Multiplier-free Deep Neural Networks

05/11/2017
by   Hokchhay Tann, et al.
0

While Deep Neural Networks (DNNs) push the state-of-the-art in many machine learning applications, they often require millions of expensive floating-point operations for each input classification. This computation overhead limits the applicability of DNNs to low-power, embedded platforms and incurs high cost in data centers. This motivates recent interests in designing low-power, low-latency DNNs based on fixed-point, ternary, or even binary data precision. While recent works in this area offer promising results, they often lead to large accuracy drops when compared to the floating-point networks. We propose a novel approach to map floating-point based DNNs to 8-bit dynamic fixed-point networks with integer power-of-two weights with no change in network architecture. Our dynamic fixed-point DNNs allow different radix points between layers. During inference, power-of-two weights allow multiplications to be replaced with arithmetic shifts, while the 8-bit fixed-point representation simplifies both the buffer and adder design. In addition, we propose a hardware accelerator design to achieve low-power, low-latency inference with insignificant degradation in accuracy. Using our custom accelerator design with the CIFAR-10 and ImageNet datasets, we show that our method achieves significant power and energy savings while increasing the classification accuracy.

READ FULL TEXT
research
12/05/2018

Deep Positron: A Deep Neural Network Using the Posit Number System

The recent surge of interest in Deep Neural Networks (DNNs) has led to i...
research
03/12/2020

Proposal of a Takagi-Sugeno Fuzzy-PI Controller Hardware

This work proposes dedicated hardware for an intelligent control system ...
research
06/03/2018

Deploying Customized Data Representation and Approximate Computing in Machine Learning Applications

Major advancements in building general-purpose and customized hardware h...
research
07/07/2023

BlendNet: Design and Optimization of a Neural Network-Based Inference Engine Blending Binary and Fixed-Point Convolutions

This paper presents BlendNet, a neural network architecture employing a ...
research
03/23/2022

Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models

Increasingly larger and better Transformer models keep advancing state-o...
research
12/12/2016

Understanding the Impact of Precision Quantization on the Accuracy and Energy of Neural Networks

Deep neural networks are gaining in popularity as they are used to gener...
research
11/29/2017

Transfer Learning with Binary Neural Networks

Previous work has shown that it is possible to train deep neural network...

Please sign up or login with your details

Forgot password? Click here to reset