CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Network

03/28/2020
by Qihang Yu, et al.

3D Convolutional Neural Networks (CNNs) have been widely applied to 3D scene understanding tasks such as video analysis and volumetric image recognition. However, 3D networks are easily over-parameterized, which incurs a high computational cost. In this paper, we propose Channel-wise Automatic KErnel Shrinking (CAKES), which enables efficient 3D learning by shrinking standard 3D convolutions into a set of economical operations (e.g., 1D and 2D convolutions). Unlike previous methods, CAKES performs channel-wise kernel shrinkage, which has two benefits: 1) it encourages the operations deployed in each layer to be heterogeneous, so that they extract diverse and complementary information that benefits learning; and 2) it allows an efficient and flexible replacement design that generalizes to both spatial-temporal and volumetric data. Combined with a neural architecture search framework, applying CAKES to 3D C2FNAS and ResNet50 achieves state-of-the-art performance with far fewer parameters and lower computational cost on both 3D medical image segmentation and video action recognition.
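To make the parameter savings concrete, the following is a minimal sketch of the arithmetic behind channel-wise shrinking. It is not the authors' implementation: the function names, the 64-channel layer size, and the half-and-half split between a 2D (1x3x3) and a 1D (3x1x1) kernel are all assumptions chosen for illustration. In CAKES the allocation of operations is found by architecture search, not fixed by hand.

```python
def conv_params(c_in, c_out, kernel):
    """Number of weights in a convolution with the given kernel shape (bias ignored)."""
    kt, kh, kw = kernel
    return c_in * c_out * kt * kh * kw


def mixed_params(c_in, c_out, split):
    """Weights when output channels are divided among different kernel shapes.

    `split` maps a kernel shape to the number of output channels that use it
    (a hypothetical representation, used only for this example).
    """
    assert sum(split.values()) == c_out, "split must cover all output channels"
    return sum(conv_params(c_in, n, k) for k, n in split.items())


c_in, c_out = 64, 64  # assumed layer size

# Standard 3D convolution: every output channel uses a 3x3x3 kernel.
full_3d = conv_params(c_in, c_out, (3, 3, 3))

# Assumed allocation: half the output channels use a 2D kernel (1x3x3),
# the other half a 1D kernel along the third axis (3x1x1).
mixed = mixed_params(c_in, c_out, {(1, 3, 3): 32, (3, 1, 1): 32})

print(full_3d, mixed, round(mixed / full_3d, 3))
```

Under these assumed numbers the mixed layer holds 24,576 weights against 110,592 for the full 3D layer, roughly 22% of the original, while still combining in-plane and cross-plane filters within a single layer.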


