CNN Acceleration by Low-rank Approximation with Quantized Factors

06/16/2020
by Nikolay Kozyrskiy, et al.

Modern convolutional neural networks achieve excellent results on complex computer vision tasks, yet they still cannot be used effectively on mobile and embedded devices because of strict constraints on computational complexity, memory, and power consumption. CNNs therefore have to be compressed and accelerated before deployment. To address this problem, we propose a novel approach that combines two known methods: low-rank tensor approximation in the Tucker format and quantization of weights and feature maps (activations). We propose greedy one-step and multi-step algorithms for selecting the multilinear ranks, and we develop a procedure for restoring accuracy after the Tucker decomposition and quantization are applied. The efficiency of our method is demonstrated for ResNet18 and ResNet34 on the CIFAR-10, CIFAR-100, and ImageNet classification tasks. A comparative analysis against other compression and acceleration methods shows that our approach is promising.
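
To make the idea concrete, below is a minimal sketch (not the authors' implementation) of the two building blocks the abstract describes: a Tucker-2 decomposition of a convolutional kernel over its two channel modes, followed by simple uniform quantization of the resulting factors. The helper names, the ranks r_out/r_in, and the 8-bit setting are illustrative assumptions; the paper selects the multilinear ranks with its greedy one-step/multi-step procedures and adds a quality-restoration (fine-tuning) stage that is omitted here.

```python
# Sketch only: Tucker-2 factorization of a conv layer with uniformly quantized factors.
import torch
import torch.nn as nn


def tucker2_decompose(weight, r_out, r_in):
    """HOSVD-style Tucker-2 of a conv weight (C_out, C_in, k, k) over the channel modes."""
    c_out, c_in, kh, kw = weight.shape
    # Mode-0 unfolding (C_out, C_in*k*k): leading left singular vectors give U_out.
    u_out, _, _ = torch.linalg.svd(weight.reshape(c_out, -1), full_matrices=False)
    u_out = u_out[:, :r_out]                                   # (C_out, R_out)
    # Mode-1 unfolding (C_in, C_out*k*k): leading left singular vectors give U_in.
    u_in, _, _ = torch.linalg.svd(weight.permute(1, 0, 2, 3).reshape(c_in, -1),
                                  full_matrices=False)
    u_in = u_in[:, :r_in]                                      # (C_in, R_in)
    # Core = W x_0 U_out^T x_1 U_in^T, shape (R_out, R_in, k, k).
    core = torch.einsum("oikl,or,is->rskl", weight, u_out, u_in)
    return core, u_out, u_in


def quantize_uniform(t, bits=8):
    """Symmetric per-tensor uniform quantization (fake-quant: returns dequantized values)."""
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max() / qmax
    return torch.round(t / scale).clamp(-qmax, qmax) * scale


def factorized_conv(conv, r_out, r_in, bits=8):
    """Replace an nn.Conv2d by a 1x1 -> kxk -> 1x1 sequence built from quantized Tucker factors."""
    core, u_out, u_in = tucker2_decompose(conv.weight.data, r_out, r_in)
    core, u_out, u_in = (quantize_uniform(t, bits) for t in (core, u_out, u_in))

    first = nn.Conv2d(conv.in_channels, r_in, 1, bias=False)          # project input channels
    first.weight.data = u_in.t().reshape(r_in, conv.in_channels, 1, 1)

    middle = nn.Conv2d(r_in, r_out, conv.kernel_size, stride=conv.stride,
                       padding=conv.padding, bias=False)              # compressed spatial conv
    middle.weight.data = core

    last = nn.Conv2d(r_out, conv.out_channels, 1, bias=conv.bias is not None)
    last.weight.data = u_out.reshape(conv.out_channels, r_out, 1, 1)  # restore output channels
    if conv.bias is not None:
        last.bias.data = conv.bias.data
    return nn.Sequential(first, middle, last)


if __name__ == "__main__":
    # Ranks and bit width are arbitrary placeholders, not values from the paper.
    conv = nn.Conv2d(64, 128, 3, padding=1)
    approx = factorized_conv(conv, r_out=32, r_in=32, bits=8)
    x = torch.randn(1, 64, 56, 56)
    print(conv(x).shape, approx(x).shape)  # both (1, 128, 56, 56)
```

The factorized layer replaces one k x k convolution with a 1x1, a small k x k, and another 1x1 convolution, which is where the reduction in parameters and multiply-accumulate operations comes from; quantizing the factors then shrinks storage further.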

Related research

11/02/2021 - Low-Rank+Sparse Tensor Compression for Neural Networks
  Low-rank tensor compression has been proposed as a promising approach to...

03/23/2018 - Iterative Low-Rank Approximation for CNN Compression
  Deep convolutional neural networks contain tens of millions of parameter...

12/10/2018 - Accelerating Convolutional Neural Networks via Activation Map Compression
  The deep learning revolution brought us an extensive array of neural net...

11/19/2015 - Convolutional neural networks with low-rank regularization
  Large CNNs have delivered impressive performance in various computer vis...

08/08/2023 - Quantization Aware Factorization for Deep Neural Network Compression
  Tensor decomposition of convolutional and fully-connected layers is an e...

10/19/2018 - CNN inference acceleration using dictionary of centroids
  It is well known that multiplication operations in convolutional layers ...

11/10/2021 - An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks
  Modern convolutional neural networks (CNNs) are known to be overconfiden...
