CT-Net: Channel Tensorization Network for Video Classification

06/03/2021
by Kunchang Li, et al.

3D convolution is powerful for video classification but often computationally expensive, so recent studies mainly focus on decomposing it along the spatial-temporal and/or channel dimensions. Unfortunately, most approaches fail to achieve a good balance between convolutional efficiency and sufficient feature interaction. For this reason, we propose a concise and novel Channel Tensorization Network (CT-Net), which treats the channel dimension of the input feature as a product of K sub-dimensions. On one hand, this naturally factorizes convolution along multiple dimensions, leading to a light computational burden. On the other hand, it effectively enhances feature interaction across different channels and progressively enlarges the 3D receptive field of this interaction to boost classification accuracy. Furthermore, we equip our CT-Module with a Tensor Excitation (TE) mechanism. It learns to exploit spatial, temporal and channel attention in a high-dimensional manner, improving the cooperative power of all the feature dimensions in our CT-Module. Finally, we flexibly adapt ResNet as our CT-Net. Extensive experiments are conducted on several challenging video benchmarks, e.g., Kinetics-400 and Something-Something V1 and V2. Our CT-Net outperforms a number of recent SOTA approaches in terms of accuracy and/or efficiency. Code and models will be available at https://github.com/Andy1621/CT-Net.
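
To make the efficiency claim concrete, the following is a rough sketch of how splitting the channel dimension into K sub-dimensions can shrink a convolution's weight count. It is not the authors' implementation. It assumes equal input and output channels, no bias, a 3x3x3 kernel for every sub-convolution, and that the k-th convolution mixes only the C_k channels along its own sub-dimension (equivalent to a grouped convolution with C / C_k groups). The function names and the 8 x 8 split are hypothetical.

```python
from math import prod


def full_conv_params(channels, kernel=(3, 3, 3)):
    """Weights in a dense 3D convolution with equal in/out channels (no bias)."""
    return channels * channels * prod(kernel)


def tensorized_conv_params(sub_dims, kernel=(3, 3, 3)):
    """Weights when one convolution is applied per channel sub-dimension.

    Assumption: the k-th convolution mixes only the C_k channels that share
    all other sub-dimension indices, i.e. a grouped convolution with
    C / C_k groups, which has C * C_k * prod(kernel) weights.
    """
    channels = prod(sub_dims)
    return sum(channels * c_k * prod(kernel) for c_k in sub_dims)


# Hypothetical split: C = 64 channels viewed as K = 2 sub-dimensions of size 8.
sub_dims = (8, 8)
dense = full_conv_params(prod(sub_dims))
factored = tensorized_conv_params(sub_dims)
ratio = dense / factored

print(f"dense: {dense}, factored: {factored}, reduction: {ratio:.1f}x")
```

Under these assumptions the saving grows with the number of channels, since the dense cost scales with C squared while each factored term scales with C times C_k.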

research
03/11/2021

ACTION-Net: Multipath Excitation for Action Recognition

Spatial-temporal, channel-wise, and motion patterns are three complement...
research
01/30/2021

SA-Net: Shuffle Attention for Deep Convolutional Neural Networks

Attention mechanisms, which enable a neural network to accurately focus ...
research
10/08/2019

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

Channel attention has recently demonstrated to offer great potential in ...
research
08/11/2019

HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions

MobileNets, a class of top-performing convolutional neural network archi...
research
12/15/2021

RA V-Net: Deep learning network for automated liver segmentation

Accurate segmentation of the liver is a prerequisite for the diagnosis o...
research
06/25/2020

SmallBigNet: Integrating Core and Contextual Views for Video Classification

Temporal convolution has been widely used for video classification. Howe...
research
05/12/2021

CT-Net: Complementary Transfering Network for Garment Transfer with Arbitrary Geometric Changes

Garment transfer shows great potential in realistic applications with th...
