Parallel Separable 3D Convolution for Video and Volumetric Data Understanding

09/11/2018
by Felix Gonda, et al.

3D convolution layers are widely used in deep learning for video and volumetric data understanding, but at the cost of increased computation and training time. Recent works seek to replace the 3D convolution layer with convolution blocks, e.g., structured combinations of 2D and 1D convolution layers. In this paper, we propose a novel convolution block, Parallel Separable 3D Convolution (PmSCn), which applies m parallel streams of n 2D and one 1D convolution layers along different dimensions. We first mathematically justify the need for parallel streams (Pm) to replace a single 3D convolution layer through tensor decomposition. Then we jointly replace consecutive 3D convolution layers, common in modern network architectures, with multiple 2D convolution layers (Cn). Lastly, we empirically show that PmSCn is applicable to different backbone architectures, such as ResNet, DenseNet, and UNet, for different applications, such as video action recognition, MRI brain segmentation, and electron microscopy segmentation. In all three applications, we replace the 3D convolution layers in state-of-the-art models with PmSCn and achieve around 14 size and on average.
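
To give a rough sense of how the weight count of such a block scales, the sketch below compares a dense 3D convolution with m parallel streams of n 2D layers plus one 1D layer. It is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, biases are ignored, kernels are assumed cubic with side k, and each stream is assumed to map c_in to c_out channels in its first 2D layer and keep c_out channels afterwards. The abstract does not specify the channel layout, so the real block may differ.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in one dense k x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * k ** 3


def stream_params(c_in, c_out, k, n=1):
    """Weights in one stream: n 2D (k x k) layers, then one 1D (k) layer.

    Assumption (not from the source): the first 2D layer maps
    c_in -> c_out and every later layer keeps c_out channels.
    """
    total = 0
    channels = c_in
    for _ in range(n):
        total += channels * c_out * k * k  # one 2D layer
        channels = c_out
    total += channels * c_out * k  # the single 1D layer
    return total


def parallel_separable_params(c_in, c_out, k, m=1, n=1):
    """Weights in m independent parallel streams (hypothetical layout)."""
    return m * stream_params(c_in, c_out, k, n)


if __name__ == "__main__":
    dense = conv3d_params(64, 64, 3)
    for m in (1, 2, 3):
        block = parallel_separable_params(64, 64, 3, m=m, n=1)
        print(f"m={m}: dense={dense}, block={block}, ratio={block / dense:.2f}")
```

Under these assumptions the saving depends strongly on m and n, so the printed ratios should be read as a feel for the trade-off rather than as the reductions reported in the paper.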


