Training convolutional neural networks with cheap convolutions and online distillation

09/28/2019
by   Jiao Xie, et al.

The large memory and computation consumption of convolutional neural networks (CNNs) has been one of the main barriers to deploying them on resource-limited systems. To this end, cheap convolutions (e.g., group convolution, depth-wise convolution, and shift convolution) have recently been used to reduce memory and computation, but they require specific architecture design, and directly replacing standard convolutions with these cheap ones lowers the discriminability of the compressed networks. In this paper, we propose to use knowledge distillation to improve the performance of compact student networks built with cheap convolutions. In our setting, the teacher is a network with standard convolutions, while the student is a simple transformation of the teacher architecture that requires no complicated redesign. In particular, we propose a novel online distillation method that constructs the teacher network online, without pre-training, and conducts mutual learning between the teacher and student networks to improve the performance of the student model. Extensive experiments on CIFAR-10/100 and ImageNet ILSVRC 2012 demonstrate that the proposed approach simultaneously reduces the memory and computation overhead of cutting-edge CNNs while achieving superior performance compared to state-of-the-art CNN compression and acceleration methods. The code is publicly available at https://github.com/EthanZhangYC/OD-cheap-convolution.
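To make the two ingredients of the abstract concrete, the PyTorch sketch below illustrates (i) a standard 3x3 convolution replaced by one of the cheap alternatives mentioned (a depth-wise convolution followed by a 1x1 pointwise convolution) and (ii) a symmetric distillation loss in which the teacher and student learn from each other online, without a pre-trained teacher. The names `DepthwiseSeparableConv` and `mutual_distillation_loss`, as well as the temperature `T` and weight `alpha`, are illustrative assumptions and not the authors' exact formulation; see the linked repository for the actual method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative "cheap" replacement for a standard 3x3 convolution:
# a depth-wise convolution followed by a 1x1 pointwise convolution.
# (One of several cheap convolutions discussed in the paper; names are ours.)
class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def mutual_distillation_loss(student_logits, teacher_logits, labels,
                             T=4.0, alpha=0.5):
    """Cross-entropy on the ground truth plus a symmetric KL term so the
    teacher and student learn from each other online (no pre-training).
    T and alpha are placeholder hyper-parameters, not the paper's values."""
    ce_s = F.cross_entropy(student_logits, labels)
    ce_t = F.cross_entropy(teacher_logits, labels)
    kl_s = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    kl_t = F.kl_div(F.log_softmax(teacher_logits / T, dim=1),
                    F.softmax(student_logits.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    return (1 - alpha) * (ce_s + ce_t) + alpha * (kl_s + kl_t)


# Toy usage: a standard conv layer (teacher side) and its cheap counterpart
# (student side) produce feature maps of the same shape, so the student is a
# simple drop-in transformation of the teacher architecture.
teacher_layer = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
student_layer = DepthwiseSeparableConv(16, 32)
x = torch.randn(8, 16, 32, 32)
print(teacher_layer(x).shape, student_layer(x).shape)  # both: [8, 32, 32, 32]

labels = torch.randint(0, 10, (8,))
loss = mutual_distillation_loss(torch.randn(8, 10), torch.randn(8, 10), labels)
print(loss.item())
```

The design choice worth noting is the `.detach()` on the target distribution in each KL term: each network treats the other's current predictions as a fixed soft target for that step, which is the usual way mutual (online) distillation avoids a pre-trained teacher.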
