Quantized Convolutional Neural Networks for Mobile Devices

12/21/2015
by   Jiaxiang Wu, et al.

Recently, convolutional neural networks (CNNs) have demonstrated impressive performance in various computer vision tasks. However, CNN models typically require high-performance hardware because of their high computational complexity, which limits their wider deployment. In this paper, we propose an efficient framework, Quantized CNN, that simultaneously speeds up computation and reduces the storage and memory overhead of CNN models. Both the filter kernels in convolutional layers and the weight matrices in fully-connected layers are quantized, with the aim of minimizing the estimation error of each layer's response. Extensive experiments on the ILSVRC-12 benchmark demonstrate a 4-6x speed-up and 15-20x compression with merely a one-percentage-point loss of classification accuracy. With our quantized CNN model, even mobile devices can accurately classify images within one second.
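
The abstract does not spell out the quantization scheme, so the following is only a rough sketch of the general idea for a fully-connected layer, assuming a product-quantization-style approach: the input dimension is split into subspaces, a small codebook is learned per subspace with k-means, and the layer response is approximated by table lookups instead of a full matrix product. The function names (`kmeans`, `quantize_fc`, `quantized_forward`) and all parameters are hypothetical. Note that plain k-means minimizes the weight reconstruction error, which is a simpler stand-in for the response-error objective described above.

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        dist = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = data[assign == j]
            if len(members):  # leave empty clusters where they are
                centroids[j] = members.mean(axis=0)
    dist = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)

def quantize_fc(weights, num_subspaces, num_codewords):
    """Quantize a (in_dim, out_dim) weight matrix, one codebook per subspace.

    Returns codebooks of shape (M, K, sub_dim) and codes of shape (M, out_dim).
    """
    in_dim, out_dim = weights.shape
    assert in_dim % num_subspaces == 0, "in_dim must divide evenly"
    sub_dim = in_dim // num_subspaces
    codebooks, codes = [], []
    for m in range(num_subspaces):
        # Each output unit contributes one sub-vector in this subspace.
        block = weights[m * sub_dim:(m + 1) * sub_dim, :].T
        centroids, assign = kmeans(block, num_codewords)
        codebooks.append(centroids)
        codes.append(assign)
    return np.stack(codebooks), np.stack(codes)

def quantized_forward(x, codebooks, codes):
    """Approximate x @ weights using per-subspace lookup tables."""
    num_subspaces, _, sub_dim = codebooks.shape
    out = np.zeros(codes.shape[1])
    for m in range(num_subspaces):
        # K inner products per subspace, shared by all output units.
        table = codebooks[m] @ x[m * sub_dim:(m + 1) * sub_dim]
        out += table[codes[m]]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.standard_normal((8, 16))
    x = rng.standard_normal(8)
    cb, idx = quantize_fc(W, num_subspaces=4, num_codewords=4)
    print("exact:      ", (x @ W)[:3])
    print("approximate:", quantized_forward(x, cb, idx)[:3])
```

In this sketch the speed-up comes from computing only K inner products per subspace and reusing them for every output unit, and the compression comes from storing small codebooks plus one index per sub-vector instead of the full weights. The actual savings depend on the chosen number of subspaces and codewords.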

research
12/29/2018

Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are state-of-the-art in numerous co...
research
12/01/2017

Accelerating Convolutional Neural Networks for Continuous Mobile Vision via Cache Reuse

Convolutional Neural Network (CNN) is the state-of-the-art algorithm of ...
research
03/26/2018

Latency and Throughput Characterization of Convolutional Neural Networks for Mobile Computer Vision

We study performance characteristics of convolutional neural networks (C...
research
07/01/2023

MobileViG: Graph-Based Sparse Attention for Mobile Vision Applications

Traditionally, convolutional neural networks (CNN) and vision transforme...
research
05/09/2017

Model Complexity-Accuracy Trade-off for a Convolutional Neural Network

Convolutional Neural Networks(CNN) has had a great success in the recent...
research
04/18/2023

Heterogeneous Integration of In-Memory Analog Computing Architectures with Tensor Processing Units

Tensor processing units (TPUs), specialized hardware accelerators for ma...
research
06/14/2015

Compressing Convolutional Neural Networks

Convolutional neural networks (CNN) are increasingly used in many areas ...
