Inception Convolution with Efficient Dilation Search

12/25/2020
by Jie Liu, et al.

Dilated convolution is an important variant of the standard convolution in neural networks: it controls the effective receptive field and handles large scale variance of objects without introducing additional computation. However, fitting the effective receptive field to the data with dilated convolution is rarely discussed in the literature. To fully explore its potential, we propose a new variant of dilated convolution, namely inception (dilated) convolution, in which the convolutions have independent dilations along different axes, channels and layers. To provide a practical method for fitting the complex inception convolution to the data, we develop a simple yet effective dilation search algorithm (EDO) based on statistical optimization. The search method operates in a zero-cost manner and is extremely fast to apply to large-scale datasets. Empirical results show that our method obtains consistent performance gains across an extensive range of benchmarks. For instance, by simply replacing the 3 x 3 standard convolutions in the ResNet-50 backbone with inception convolutions, we improve the mAP of Faster-RCNN on MS-COCO from 36.4 [...]. Furthermore, using the same replacement in the ResNet-101 backbone, we achieve a large improvement in AP score from 60.2 [...] for bottom-up human pose estimation.
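As a rough illustration of what independent per-axis, per-channel dilation means, the sketch below uses plain Python and no deep learning framework. The function names (`effective_kernel_size`, `channel_footprints`, `dilated_conv2d`) and the dilation pattern are hypothetical and chosen only for this example; this is not the authors' implementation, and it does not cover the EDO search. It relies on the standard fact that a kernel of size k with dilation d spans d(k - 1) + 1 input positions along an axis, while the number of weights stays at k x k.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered along one axis: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def channel_footprints(dilations, kernel_size=3):
    """(height, width) span for each channel's (d_h, d_w) dilation pair."""
    return [
        (effective_kernel_size(kernel_size, d_h),
         effective_kernel_size(kernel_size, d_w))
        for d_h, d_w in dilations
    ]


def dilated_conv2d(image, kernel, d_h, d_w):
    """Valid (no padding) 2D cross-correlation with separate dilations per axis."""
    k = len(kernel)
    out_h = len(image) - d_h * (k - 1)
    out_w = len(image[0]) - d_w * (k - 1)
    return [
        [
            sum(
                kernel[a][b] * image[i + a * d_h][j + b * d_w]
                for a in range(k)
                for b in range(k)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical dilation pattern: one (d_h, d_w) pair per output channel.
pattern = [(1, 1), (1, 3), (2, 1), (4, 4)]
footprints = channel_footprints(pattern)

# Every channel still uses 3 * 3 = 9 weights, whatever its dilation.
ones_image = [[1] * 5 for _ in range(5)]
ones_kernel = [[1] * 3 for _ in range(3)]
response = dilated_conv2d(ones_image, ones_kernel, d_h=2, d_w=1)
```

With this pattern each channel covers a differently shaped region of the input while the weight count per channel is unchanged, which is why varying the dilation adds no computation for a fixed output size.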

research
10/07/2016

Xception: Deep Learning with Depthwise Separable Convolutions

We present an interpretation of Inception modules in convolutional neura...
research
01/12/2023

Towards High Performance One-Stage Human Pose Estimation

Making top-down human pose estimation method present both good performan...
research
06/26/2019

Accelerating Large-Kernel Convolution Using Summed-Area Tables

Expanding the receptive field to capture large-scale context is key to o...
research
02/09/2023

Gaussian Mask Convolution for Convolutional Neural Networks

Square convolution is a default unit in convolutional neural networks as...
research
09/11/2018

Parallel Separable 3D Convolution for Video and Volumetric Data Understanding

For video and volumetric data understanding, 3D convolution layers are w...
research
05/05/2017

Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation

Human pose estimation using deep neural networks aims to map input image...
research
07/08/2022

VidConv: A modernized 2D ConvNet for Efficient Video Recognition

Since being introduced in 2020, Vision Transformers (ViT) has been stead...
