Latency-aware Spatial-wise Dynamic Networks

by Yizeng Han, et al.

Spatial-wise dynamic convolution has become a promising approach to improving the inference efficiency of deep networks. By allocating more computation to the most informative pixels, such an adaptive inference paradigm reduces the spatial redundancy in image features and saves a considerable amount of unnecessary computation. However, the theoretical efficiency achieved by previous methods can hardly translate into a realistic speedup, especially on multi-core processors (e.g., GPUs). The key challenge is that the existing literature has focused only on designing algorithms with minimal computation, ignoring the fact that practical latency is also influenced by scheduling strategies and hardware properties. To bridge the gap between theoretical computation and practical efficiency, we propose a latency-aware spatial-wise dynamic network (LASNet), which performs coarse-grained spatially adaptive inference under the guidance of a novel latency prediction model. The latency prediction model can efficiently estimate the inference latency of dynamic networks by simultaneously considering algorithms, scheduling strategies, and hardware properties. We use the latency predictor to guide both the algorithm design and the scheduling optimization on various hardware platforms. Experiments on image classification, object detection, and instance segmentation demonstrate that the proposed framework significantly improves the practical inference efficiency of deep networks. For example, the average latency of a ResNet-101 on the ImageNet validation set could be reduced by 36% and 46% on a server GPU (Nvidia Tesla-V100) and an edge device (Nvidia Jetson TX2 GPU), respectively, without sacrificing accuracy. Code is available at
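To make "coarse-grained spatially adaptive inference" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a decision is made once per patch and then expanded to pixel level, so that every pixel in a patch is either computed or skipped together. The function names (`coarse_mask`, `upsample_mask`, `masked_update`), the threshold, and the granularity value are all hypothetical and chosen only for illustration.

```python
import numpy as np

def coarse_mask(scores, threshold=0.5):
    """Binarize per-patch scores: True means the patch is computed."""
    return scores > threshold

def upsample_mask(mask, granularity):
    """Expand each coarse decision to a granularity x granularity pixel block."""
    return np.repeat(np.repeat(mask, granularity, axis=0), granularity, axis=1)

def masked_update(features, update, pixel_mask):
    """Apply the update where the mask is on; keep the input features elsewhere."""
    return np.where(pixel_mask[None, :, :], update, features)

# Toy example (assumed values): a 2x2 grid of patch scores, 4x4 pixels per patch.
scores = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
mask = coarse_mask(scores)
pixel_mask = upsample_mask(mask, granularity=4)   # shape (8, 8)

features = np.zeros((3, 8, 8))                    # C x H x W input features
update = np.ones((3, 8, 8))                       # stand-in for a block's output
out = masked_update(features, update, pixel_mask)

# Fraction of pixels that would be computed (a proxy for theoretical cost).
activation_ratio = pixel_mask.mean()
```

Note that this dense version still evaluates every pixel and only illustrates the masking semantics. An actual speedup requires gathering the active patches and running the computation on them alone, which is where scheduling and hardware properties begin to determine the practical latency.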




