Joint Neural Architecture Search and Quantization

11/23/2018
by   Yukang Chen, et al.
0

Designing neural architectures is a fundamental step in deep learning applications. As a partner technique, model compression on neural networks has been widely investigated to gear the needs that the deep learning algorithms could be run with the limited computation resources on mobile devices. Currently, both the tasks of architecture design and model compression require expertise tricks and tedious trials. In this paper, we integrate these two tasks into one unified framework, which enables the joint architecture search with quantization (compression) policies for neural networks. This method is named JASQ. Here our goal is to automatically find a compact neural network model with high performance that is suitable for mobile devices. Technically, a multi-objective evolutionary search algorithm is introduced to search the models under the balance between model size and performance accuracy. In experiments, we find that our approach outperforms the methods that search only for architectures or only for quantization policies. 1) Specifically, given existing networks, our approach can provide them with learning-based quantization policies, and outperforms their 2 bits, 4 bits, 8 bits, and 16 bits counterparts. It can yield higher accuracies than the float models, for example, over 1.02 balance between model size and performance accuracy, two models are obtained with joint search of architectures and quantization policies: a high-accuracy model and a small model, JASQNet and JASQNet-Small that achieves 2.97 rate with 0.9 MB on CIFAR-10.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/15/2022

Towards Hardware-Specific Automatic Compression of Neural Networks

Compressing neural network architectures is important to allow the deplo...
research
10/20/2018

Differentiable Fine-grained Quantization for Deep Neural Network Compression

Neural networks have shown great performance in cognitive tasks. When de...
research
02/15/2019

AutoQB: AutoML for Network Quantization and Binarization on Mobile Devices

In this paper, we propose a hierarchical deep reinforcement learning (DR...
research
07/14/2020

Sparse CNN Architecture Search (SCAS)

Advent of deep neural networks has revolutionized Computer Vision. Howev...
research
01/04/2019

Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search

Fabricating neural models for a wide range of mobile devices demands for...
research
08/02/2021

Multi-objective Recurrent Neural Networks Optimization for the Edge – a Quantization-based Approach

The compression of deep learning models is of fundamental importance in ...
research
04/21/2020

A Data and Compute Efficient Design for Limited-Resources Deep Learning

Thanks to their improved data efficiency, equivariant neural networks ha...

Please sign up or login with your details

Forgot password? Click here to reset