FLightNNs: Lightweight Quantized Deep Neural Networks for Fast and Accurate Inference

04/05/2019
by Ruizhou Ding, et al.

To improve the throughput and energy efficiency of Deep Neural Networks (DNNs) on customized hardware, lightweight neural networks constrain each weight of a DNN to be a sum of a limited number (denoted k∈{1,2}) of powers of 2. In such networks, the multiply-accumulate operation can be replaced with a single shift operation, or with two shifts and an add operation. To provide even more design flexibility, the k for each convolutional filter can be optimally chosen instead of being fixed for every filter. In this paper, we formulate the selection of k to be differentiable, and describe model training for determining k-based weights on a per-filter basis. Over 46 FPGA-design experiments involving eight configurations and four data sets reveal that lightweight neural networks with a flexible k value (dubbed FLightNNs) fully utilize the hardware resources on Field-Programmable Gate Arrays (FPGAs). Our experimental results show that FLightNNs achieve a 2× speedup over lightweight NNs with k=2, with only 0.1% accuracy degradation. Compared to 4-bit fixed-point quantization, FLightNNs achieve higher accuracy and up to 2× inference speedup, owing to their lightweight shift operations. In addition, our experiments demonstrate that FLightNNs also achieve higher computational energy efficiency in ASIC implementations.
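Since the page itself carries no code, here is a minimal NumPy sketch of the arithmetic the abstract describes. Both helper names (quantize_pow2, shift_dot) are hypothetical, and the greedy post-hoc rounding is our assumption: the paper learns its k-term power-of-2 weights during training and makes the choice of k differentiable, neither of which is reproduced here.

    import numpy as np

    def quantize_pow2(w, k):
        """Greedily round each weight to a signed sum of at most k powers
        of 2 (post-hoc rounding is an assumption; the paper learns such
        weights during training)."""
        q = np.zeros_like(w, dtype=np.float64)
        r = w.astype(np.float64)
        for _ in range(k):
            mask = r != 0
            exp = np.round(np.log2(np.abs(r[mask])))  # nearest power of 2
            term = np.sign(r[mask]) * np.exp2(exp)
            q[mask] = q[mask] + term
            r[mask] = r[mask] - term
        return q

    def shift_dot(exps, signs, x):
        """Dot product with weights of the form sign * 2**exp: each
        product becomes a bit shift of the integer activation, so a
        k-term weight costs k shifts and k-1 adds instead of a multiply."""
        acc = 0
        for e, s, xi in zip(exps, signs, x):
            # Negative exponents would need fixed-point scaling; an
            # arithmetic right shift is shown as an approximation.
            acc += s * (xi << e) if e >= 0 else s * (xi >> -e)
        return acc

    print(quantize_pow2(np.array([0.74, -0.31, 0.055]), k=1))
    # -> [ 1.  -0.25  0.0625]
    assert shift_dot([3, 1], [1, -1], [5, 7]) == 5 * 8 - 7 * 2  # 26

With k=1, every weight-activation product collapses to a single shift; with k=2 it takes two shifts and an add, which is why letting k vary per filter spends the extra adder logic only where accuracy needs it.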


