Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition

06/20/2020
by Ionut Cosmin Duta, et al.

This work introduces pyramidal convolution (PyConv), which processes the input at multiple filter scales. PyConv contains a pyramid of kernels, where each level uses a different type of filter with its own size and depth, so that the levels capture different levels of detail in the scene. Beyond these improved recognition capabilities, PyConv is also efficient: with our formulation, it does not increase the computational cost or the number of parameters compared to standard convolution. It is also flexible and extensible, providing a large space of potential network architectures for different applications. PyConv has the potential to impact nearly every computer vision task. In this work, we present different PyConv-based architectures for four main visual recognition tasks: image classification, video action classification/recognition, object detection, and semantic image segmentation/parsing. Our approach shows significant improvements over the baselines on all of these core tasks. For instance, on image recognition, our 50-layer network outperforms its 152-layer ResNet baseline counterpart in recognition performance on the ImageNet dataset, while having 2.39 times fewer parameters, 2.52 times lower computational complexity, and more than 3 times fewer layers. On image segmentation, our novel framework sets a new state of the art on the challenging ADE20K benchmark for scene parsing. Code is available at: https://github.com/iduta/pyconv
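The claim that a pyramid of larger kernels need not cost more parameters than a standard convolution can be checked with simple arithmetic. The sketch below is an illustration only, not the authors' implementation. It assumes that "varying depth" is realized with grouped convolution, so a larger kernel sees fewer input channels. The channel counts, kernel sizes, and group counts in `LEVELS`, and the helper names `conv_params` and `pyconv_params`, are hypothetical choices made for this example and are not taken from the abstract.

```python
def conv_params(in_channels, out_channels, kernel_size, groups=1):
    """Weight count of a 2D (grouped) convolution, bias omitted."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output filter only spans in_channels / groups input channels.
    return out_channels * (in_channels // groups) * kernel_size * kernel_size


def pyconv_params(in_channels, levels):
    """Total weights of a pyramid whose levels are (out_channels, kernel_size, groups)."""
    return sum(conv_params(in_channels, out_ch, k, g) for out_ch, k, g in levels)


# Assumed configuration: 64 input maps, 64 output maps split over four levels.
# Larger kernels use more groups, i.e. a shallower kernel depth.
LEVELS = [(16, 3, 1), (16, 5, 4), (16, 7, 8), (16, 9, 16)]

standard = conv_params(64, 64, 3)   # plain 3x3 convolution, 64 -> 64 channels
pyramid = pyconv_params(64, LEVELS)  # four kernel sizes, 64 -> 64 channels

print(f"standard 3x3: {standard} weights")
print(f"pyramid:      {pyramid} weights")
```

Under these assumed settings the pyramid uses 27072 weights against 36864 for the plain 3x3 layer, which is consistent with the abstract's statement that the formulation does not increase the parameter count. Other splits of channels and groups would give different totals.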


research
10/15/2020

HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network

This paper addresses representational block named Hierarchical-Split Blo...
research
08/17/2021

Contextual Convolutional Neural Networks

We propose contextual convolution (CoConv) for visual recognition. CoCon...
research
04/25/2019

Local Relation Networks for Image Recognition

The convolution layer has been the dominant feature extractor in compute...
research
03/20/2022

TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing

As convolution has empowered many smart applications, dynamic convolutio...
research
01/09/2020

Multi-Scale Weight Sharing Network for Image Recognition

In this paper, we explore the idea of weight sharing over multiple scale...
research
08/12/2021

MicroNet: Improving Image Recognition with Extremely Low FLOPs

This paper aims at addressing the problem of substantial performance deg...
research
09/11/2018

Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

The design of neural network architectures is an important component for...
