DARC: Differentiable ARchitecture Compression

05/20/2019
by Shashank Singh, et al.

In many learning situations, resources at inference time are significantly more constrained than resources at training time. This paper studies a general paradigm, called Differentiable ARchitecture Compression (DARC), that combines model compression and architecture search to learn models that are resource-efficient at inference time. Given a resource-intensive base architecture, DARC utilizes the training data to learn which sub-components can be replaced by cheaper alternatives. The high-level technique can be applied to any neural architecture, and we report experiments on state-of-the-art convolutional neural networks for image classification. For a WideResNet with 97.2% accuracy on CIFAR-10, we improve single-sample inference speed by 2.28× and reduce memory footprint by 5.64×, with no accuracy loss. For a ResNet with 79.15% Top-1 accuracy on ImageNet, we improve batch inference speed by 1.29× and reduce memory footprint by 3.57×, with 1% accuracy loss. We also give theoretical Rademacher complexity bounds in simplified cases, showing how DARC avoids overfitting despite over-parameterization.
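The abstract describes the selection mechanism only at a high level. As a rough illustration, the sketch below shows one plausible, DARTS-style way to make sub-component replacement differentiable: each candidate (the original block plus cheaper alternatives) is mixed through a softmax over learnable architecture logits, and an expected-cost term is added to the training loss. This is a minimal PyTorch sketch, not the authors' implementation; the class name DifferentiableCompressionCell, the candidate operations, the cost constants, and the penalty weight are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class DifferentiableCompressionCell(nn.Module):
    """Hypothetical cell: softmax-weighted mixture over a base op and cheaper alternatives."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Candidate replacements for one sub-component: the original 3x3 conv,
        # a cheaper depthwise-separable conv, and a 1x1 conv.
        self.candidates = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=1),          # base (most expensive) op
            nn.Sequential(                                    # cheaper depthwise-separable op
                nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),
                nn.Conv2d(in_ch, out_ch, 1),
            ),
            nn.Conv2d(in_ch, out_ch, 1),                      # cheapest op
        ])
        # Illustrative relative inference costs of the candidates.
        self.register_buffer("costs", torch.tensor([1.0, 0.35, 0.1]))
        # One learnable architecture logit per candidate.
        self.arch_logits = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Differentiable "soft" choice among candidates.
        weights = torch.softmax(self.arch_logits, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.candidates))

    def expected_cost(self) -> torch.Tensor:
        # Differentiable resource penalty: expected cost under the softmax weights.
        return (torch.softmax(self.arch_logits, dim=0) * self.costs).sum()

# Toy usage: add the expected cost of every cell to the task loss.
cell = DifferentiableCompressionCell(16, 16)
x = torch.randn(2, 16, 32, 32)
task_loss = cell(x).pow(2).mean()            # stand-in for a real task loss
loss = task_loss + 0.1 * cell.expected_cost()
loss.backward()
```

In schemes of this kind, the mixture is typically discretized after training by keeping only the highest-weight candidate in each cell; whether DARC follows exactly this recipe is not stated in the abstract.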

Related research:

10/27/2020 · μNAS: Constrained Neural Architecture Search for Microcontrollers
IoT devices are powered by microcontroller units (MCUs) which are extrem...

09/23/2022 · Tiered Pruning for Efficient Differentiable Inference-Aware Neural Architecture Search
We propose three novel pruning techniques to improve the cost and result...

03/23/2022 · U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search
Optimizing resource utilization in target platforms is key to achieving ...

03/27/2019 · Network Slimming by Slimmable Networks: Towards One-Shot Architecture Search for Channel Numbers
We study how to set channel numbers in a neural network to achieve bette...

12/18/2020 · Resource-efficient DNNs for Keyword Spotting using Neural Architecture Search and Quantization
This paper introduces neural architecture search (NAS) for the automatic...

03/07/2022 · Dynamic ConvNets on Tiny Devices via Nested Sparsity
This work introduces a new training and compression pipeline to build Ne...

03/18/2022 · LeHDC: Learning-Based Hyperdimensional Computing Classifier
Thanks to the tiny storage and efficient execution, hyperdimensional Com...
