Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference

12/06/2019
by   Thomas Verelst, et al.
0

Modern convolutional neural networks apply the same operations on every pixel in an image. However, not all image regions are equally important. To address this inefficiency, we propose a method to dynamically apply convolutions conditioned on the input image. We introduce a residual block where a small gating branch learns which spatial positions should be evaluated. These discrete gating decisions are trained end-to-end using the Gumbel-Softmax trick, in combination with a sparsity criterion. Our experiments on Food-101, CIFAR and ImageNet show that our method has better focus on the region of interest and better accuracy than existing methods, at a lower computational complexity. Moreover, we provide an efficient CUDA implementation of our dynamic convolutions using a gather-scatter approach, achieving a significant improvement in inference speed on MobileNetV2 and ShuffleNetV2. On human pose estimation, a task that is inherently spatially sparse, the processing speed is increased by 45

READ FULL TEXT

page 7

page 8

research
05/29/2021

FCPose: Fully Convolutional Multi-Person Pose Estimation with Dynamic Instance-Aware Convolutions

We propose a fully convolutional multi-person pose estimation framework ...
research
07/08/2020

Dynamic Group Convolution for Accelerating Convolutional Neural Networks

Replacing normal convolutions with group convolutions can significantly ...
research
11/23/2019

Simple and Lightweight Human Pose Estimation

Recent research on human pose estimation has achieved significant improv...
research
08/22/2023

PoseGraphNet++: Enriching 3D Human Pose with Orientation Estimation

Existing kinematic skeleton-based 3D human pose estimation methods only ...
research
11/24/2020

SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation

SegBlocks reduces the computational cost of existing neural networks, by...
research
04/01/2021

Confidence Adaptive Anytime Pixel-Level Recognition

Anytime inference requires a model to make a progression of predictions ...
research
04/13/2021

Lite-HRNet: A Lightweight High-Resolution Network

We present an efficient high-resolution network, Lite-HRNet, for human p...

Please sign up or login with your details

Forgot password? Click here to reset