DAC: Data-free Automatic Acceleration of Convolutional Networks

12/20/2018
by Xin Li, et al.

Deploying a deep learning model on mobile/IoT devices is a challenging task. The difficulty lies in the trade-off between computation speed and accuracy. A complex deep learning model with high accuracy runs slowly on resource-limited devices, while a light-weight model that runs much faster loses accuracy. In this paper, we propose a novel decomposition method, namely DAC, that is capable of factorizing an ordinary convolutional layer into two layers with far fewer parameters. DAC computes the corresponding weights for the newly generated layers directly from the weights of the original convolutional layer. Thus, no training (or fine-tuning) and no data are needed. The experimental results show that DAC reduces a large number of floating-point operations (FLOPs) while maintaining the high accuracy of the pre-trained model. If a 2% accuracy drop is acceptable, DAC saves 53% of FLOPs on the ImageNet dataset, 29% on the PASCAL VOC dataset, and 46% on the COCO dataset. Compared to other existing decomposition methods, DAC achieves better performance.
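The paper details DAC's exact factorization; as a rough illustration of the general idea it describes (replacing one convolution with two cheaper layers whose weights are computed directly from the original weights, with no training data), the NumPy sketch below factorizes a conv kernel with a truncated SVD. The function name, rank choice, and reshaping scheme are illustrative assumptions for this sketch, not DAC's channel-wise decomposition.

```python
import numpy as np

def lowrank_factorize_conv(W, rank):
    """Approximate a conv weight W of shape (C_out, C_in, k, k) by two layers:
       W1: (rank, C_in, k, k)  -- a k x k conv with only 'rank' output channels
       W2: (C_out, rank, 1, 1) -- a 1x1 conv that restores the C_out channels
    The composed layers reproduce the truncated-SVD approximation of W."""
    c_out, c_in, k, _ = W.shape
    M = W.reshape(c_out, c_in * k * k)               # flatten all but the output axis
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                     # (C_out, rank), singular values folded in
    V_r = Vt[:rank, :]                               # (rank, C_in*k*k)
    W1 = V_r.reshape(rank, c_in, k, k)               # spatial conv with fewer filters
    W2 = U_r.reshape(c_out, rank, 1, 1)              # 1x1 conv mixing channels back
    return W1, W2

# Example: a random 3x3 conv with 64 input / 128 output channels, factorized at rank 32.
W = np.random.randn(128, 64, 3, 3).astype(np.float32)
W1, W2 = lowrank_factorize_conv(W, rank=32)
print(f"params: {W.size} -> {W1.size + W2.size} "
      f"({(W1.size + W2.size) / W.size:.2%} of the original)")
```

In this toy example the two generated layers hold roughly 31% of the original layer's parameters; how much can be saved in practice, and at what accuracy cost, depends on the chosen rank and on the layer being decomposed.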

Related research

12/24/2022
Pruning On-the-Fly: A Recoverable Pruning Method without Fine-tuning
Most existing pruning works are resource-intensive, requiring retraining...

02/15/2023
TFormer: A Transmission-Friendly ViT Model for IoT Devices
Deploying high-performance vision transformer (ViT) models on ubiquitous...

08/16/2020
KutralNet: A Portable Deep Learning Model for Fire Recognition
Most of the automatic fire alarm systems detect the fire presence throug...

05/24/2019
Light-Weight RetinaNet for Object Detection
Object detection has gained great progress driven by the development of ...

03/22/2021
Channel Scaling: A Scale-and-Select Approach for Transfer Learning
Transfer learning with pre-trained neural networks is a common strategy ...

09/10/2019
Accelerating Training using Tensor Decomposition
Tensor decomposition is one of the well-known approaches to reduce the l...

10/15/2022
Variant Parallelism: Lightweight Deep Convolutional Models for Distributed Inference on IoT Devices
Two major techniques are commonly used to meet real-time inference limit...