Fully Dynamic Inference with Deep Neural Networks

07/29/2020
by   Wenhan Xia, et al.

Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, is often achieved at the expense of high computational cost, memory bandwidth, and inference latency, which prevents their deployment in resource-constrained and time-sensitive scenarios, such as edge-side inference and self-driving cars. While recently developed methods for creating efficient deep neural networks are making their real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods typically use a one-size-fits-all approach that identically processes all inputs. Motivated by the fact that different images require different feature embeddings to be accurately classified, we propose a fully dynamic paradigm that imparts deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual convolutional filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. L-Net and C-Net also learn how to scale retained computation outputs to maximize task accuracy. By integrating L-Net and C-Net into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks with respect to both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net results in up to 11.9× fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4× fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
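The abstract describes per-instance gating at the channel level: a small auxiliary network looks at the input feature map, decides which output channels to skip, and learns a scaling factor for the channels it keeps. As a rough illustration of that idea only, here is a minimal PyTorch sketch of a channel gate attached to a convolutional block. All names (`ChannelGate`, `GatedConvBlock`) and design choices (average-pool summary, a single linear layer, sigmoid scaling, thresholded keep/skip decision) are assumptions for illustration, not the authors' C-Net implementation.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Hypothetical C-Net-style gate: from the incoming feature map, predict a
    hard per-channel keep/skip decision and a scaling factor for kept channels."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # summarize spatial dimensions
        self.fc = nn.Linear(in_channels, out_channels)

    def forward(self, x):
        s = self.pool(x).flatten(1)                   # (B, C_in) per-instance summary
        logits = self.fc(s)                           # (B, C_out) one logit per output channel
        keep = (logits > 0).float()                   # hard skip decision per channel
        scale = torch.sigmoid(logits)                 # learned scaling of retained outputs
        gate = keep * scale                           # skipped channels -> 0, kept -> scaled
        return gate.view(x.size(0), -1, 1, 1)

class GatedConvBlock(nn.Module):
    """Conv block whose output channels are skipped or rescaled per instance."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.gate = ChannelGate(in_channels, out_channels)

    def forward(self, x):
        g = self.gate(x)                              # per-instance channel gates
        y = torch.relu(self.bn(self.conv(x)))
        return y * g                                  # mask skipped channels to zero

# Usage: gates differ across the batch because they depend on each input.
block = GatedConvBlock(16, 32)
x = torch.randn(8, 16, 28, 28)
y = block(x)                                          # (8, 32, 28, 28); skipped channels are zero
```

Note that this sketch only masks skipped channels to zero; realizing the FLOP savings reported in the paper would require actually eliding the convolutions for skipped channels at inference time, and training would typically add a sparsity term encouraging the gates to skip.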
