Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

12/21/2021
by Minghai Qin, et al.

Deep neural networks (DNNs) have been shown to provide superb performance in many real-life applications, but their high computation cost and storage requirements have prevented them from being deployed to many edge and internet-of-things (IoT) devices. Sparse deep neural networks, in which the majority of the weight parameters are zero, can substantially reduce the computation complexity and memory consumption of a model. In real-use scenarios, devices may suffer from large fluctuations in the available computation and memory resources across different environments, and the quality of service (QoS) is difficult to maintain because of long-tail inferences with large latency. To address these real-life challenges, we propose to train a sparse model that supports multiple sparsity levels. That is, the weights satisfy a hierarchical structure such that the locations and the values of the non-zero parameters of a more-sparse sub-model are a subset of those of a less-sparse sub-model. In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model. We have verified our methodology on a variety of DNN models and tasks, including ResNet-50, PointNet++, GNMT, and graph attention networks. We obtain sparse sub-models with an average of 13.38 accuracies are as good as their dense counterparts. More-sparse sub-models with 5.38 can be obtained with only 3.25
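The nested-subset property described above can be sketched in a few lines. The following is a minimal illustration, not the paper's actual training procedure: it builds magnitude-based pruning masks (an assumed criterion for illustration) such that the non-zero positions at a higher sparsity level are a subset of those at a lower level, so a single stored weight tensor serves every level.

```python
import numpy as np

def nested_masks(weights, sparsities):
    """Build nested binary masks: at a higher sparsity level, the surviving
    (non-zero) positions are a subset of those at any lower level, because all
    levels prune from the same magnitude-sorted ordering."""
    flat = np.abs(weights).ravel()
    order = np.argsort(flat)                 # indices sorted by ascending magnitude
    masks = {}
    for s in sorted(sparsities):
        k = int(s * flat.size)               # number of weights pruned at this level
        mask = np.ones(flat.size, dtype=bool)
        mask[order[:k]] = False              # zero out the k smallest magnitudes
        masks[s] = mask.reshape(weights.shape)
    return masks

def infer(x, weights, mask):
    """Select a sparsity level at inference time by applying its mask."""
    return x @ (weights * mask)

# Toy example on a 4x4 weight matrix with 50%- and 90%-sparse levels.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
masks = nested_masks(w, [0.5, 0.9])

# Subset property: every non-zero of the 90%-sparse sub-model is also
# non-zero in the 50%-sparse sub-model.
assert np.all(masks[0.9] <= masks[0.5])
```

Because all levels share the same values and the more-sparse support is contained in the less-sparse one, only the least sparse sub-model's non-zeros need to be stored, which is exactly why the storage cost is capped by that sub-model.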

Related research:

- Efficient Synthesis of Compact Deep Neural Networks (04/18/2020)
- Computation on Sparse Neural Networks: an Inspiration for Future Hardware (04/24/2020)
- Compacting Deep Neural Networks for Internet of Things: Methods and Applications (03/20/2021)
- Sparsifying Binary Networks (07/11/2022)
- DISCO: Distributed Inference with Sparse Communications (02/22/2023)
- DRESS: Dynamic REal-time Sparse Subnets (07/01/2022)
- Dynamic ConvNets on Tiny Devices via Nested Sparsity (03/07/2022)
