Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations

03/08/2017
by Liangzhen Lai, et al.

Deep convolutional neural network (CNN) inference requires a significant amount of memory and computation, which limits its deployment on embedded devices. To alleviate these problems, prior research uses low-precision fixed-point numbers to represent the CNN weights and activations. However, the minimum data precision required for fixed-point weights varies across networks and across layers of the same network. In this work, we propose using floating-point numbers to represent the weights and fixed-point numbers to represent the activations. We show that, for the same bit-width, a floating-point representation of the weights is more efficient than a fixed-point one, and we demonstrate this on popular large-scale CNNs such as AlexNet, SqueezeNet, GoogLeNet and VGG-16. We also show that this representation scheme enables a compact hardware multiply-and-accumulate (MAC) unit design. Experimental results show that the proposed scheme reduces the weight storage by up to 36% and the power consumption of the hardware multiplier by up to 50%.
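
To make the comparison in the abstract concrete, the sketch below rounds a few weight values to an 8-bit fixed-point format and to a toy 8-bit floating-point format. It is an illustration only, not the authors' implementation: the function names, the 1-4-3 sign/exponent/mantissa split, the exponent bias of 15, and the choice of 7 fractional bits are all assumptions made for this example, and the toy format has no subnormals, infinities, or NaNs.

```python
import math

def quantize_fixed(x, total_bits=8, frac_bits=7):
    """Round x to a signed fixed-point value, saturating at the range limits.
    (Hypothetical helper; the bit split is an assumption.)"""
    scale = 1 << frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    q = max(qmin, min(qmax, round(x * scale)))
    return q / scale

def quantize_float(x, exp_bits=4, man_bits=3, bias=15):
    """Round x to a toy float: 1 sign bit, exp_bits exponent bits, man_bits
    mantissa bits, value = (1 + m / 2**man_bits) * 2**(e - bias).
    (Hypothetical helper; format and bias are assumptions.)"""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    e_min = -bias
    e_max = (1 << exp_bits) - 1 - bias
    steps = 1 << man_bits

    # frexp gives mag = f * 2**e with 0.5 <= f < 1, so the normalized
    # exponent (significand in [1, 2)) is e - 1.
    _, e = math.frexp(mag)
    exp = max(e_min, min(e_max, e - 1))

    # Significand in units of 2**-man_bits at the chosen exponent.
    m = round(mag / 2.0 ** exp * steps)
    if m >= 2 * steps:
        if exp < e_max:
            exp, m = exp + 1, steps      # rounding carried into the next exponent
        else:
            m = 2 * steps - 1            # saturate at the largest representable value
    elif m < steps:
        # Below the smallest normal value: snap to it or flush to zero.
        smallest = 2.0 ** e_min
        return sign * smallest if mag >= smallest / 2 else 0.0
    return sign * (m / steps) * 2.0 ** exp

if __name__ == "__main__":
    for w in (0.3, 0.05, 0.001):
        print(w, quantize_fixed(w), quantize_float(w))
```

With these assumed formats, a small weight such as 0.001 rounds to zero in fixed point but keeps roughly its magnitude in the floating-point format, which is the kind of dynamic-range advantage the abstract refers to. The price is a coarser step size for values close to 1.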

research
05/22/2018

Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit

Performing the inference step of deep learning in resource constrained e...
research
02/04/2022

Fixed-Point Code Synthesis For Neural Networks

Over the last few years, neural networks have started penetrating safety...
research
08/30/2020

Optimal Quantization for Batch Normalization in Neural Network Deployments and Beyond

Quantized Neural Networks (QNNs) use low bit-width fixed-point numbers f...
research
03/29/2021

Deep Compression for PyTorch Model Deployment on Microcontrollers

Neural network deployment on low-cost embedded systems, hence on microco...
research
04/22/2015

Rounding Methods for Neural Networks with Low Resolution Synaptic Weights

Neural network algorithms simulated on standard computing platforms typi...
research
01/29/2018

TernaryNet: Faster Deep Model Inference without GPUs for Medical 3D Segmentation using Sparse and Binary Convolutions

Deep convolutional neural networks (DCNN) are currently ubiquitous in me...
research
08/15/2018

DNN Feature Map Compression using Learned Representation over GF(2)

In this paper, we introduce a method to compress intermediate feature ma...
