Video Classification with Channel-Separated Convolutional Networks

04/04/2019
by Du Tran, et al.

Group convolution has been shown to offer great computational savings in various 2D convolutional architectures for image classification. It is natural to ask: 1) whether group convolution can help alleviate the high computational cost of video classification networks; 2) which factors matter most in 3D group convolutional networks; and 3) what good computation/accuracy trade-offs are achievable with 3D group convolutional networks. This paper studies the effects of group convolution in 3D convolutional networks for video classification. We empirically demonstrate that the amount of channel interaction plays an important role in the accuracy of group convolutional networks. Our experiments suggest two main findings. First, it is good practice to factorize 3D convolutions by separating channel interactions from spatiotemporal interactions, as this leads to improved accuracy and lower computational cost. Second, 3D channel-separated convolutions provide a form of regularization, yielding lower training accuracy but higher test accuracy compared to 3D convolutions. These two empirical findings lead us to design an architecture, the Channel-Separated Convolutional Network (CSN), which is simple and efficient yet accurate. On Kinetics and Sports1M, our CSNs significantly outperform state-of-the-art models while being 11 times more efficient.
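
To make the computational argument concrete, the sketch below counts the weights of a single 3D convolution under three settings: a standard convolution, a group convolution, and a channel-separated factorization. It is only an illustration, not the authors' code. The function names, the 64-channel layer width, and the 3x3x3 kernel are assumptions chosen for the example, and the factorization is assumed to be a 1x1x1 convolution (channel interactions) followed by a depthwise 3x3x3 convolution (spatiotemporal interactions).

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), groups=1):
    """Weight count (bias ignored) of a 3D convolution split into `groups` groups.

    Each output channel only sees c_in / groups input channels, so the
    weight count shrinks by a factor of `groups`.
    """
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel
    return (c_in // groups) * c_out * kt * kh * kw


def channel_separated_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weight count of an assumed channel-separated factorization.

    pointwise: 1x1x1 convolution, mixes channels only.
    depthwise: kernel-sized convolution with groups == channels,
               mixes space and time only.
    """
    pointwise = conv3d_params(c_in, c_out, kernel=(1, 1, 1))
    depthwise = conv3d_params(c_out, c_out, kernel=kernel, groups=c_out)
    return pointwise + depthwise


if __name__ == "__main__":
    c = 64  # hypothetical layer width, chosen only for illustration
    print("standard 3x3x3:     ", conv3d_params(c, c))
    print("group conv (g=4):   ", conv3d_params(c, c, groups=4))
    print("channel-separated:  ", channel_separated_params(c, c))
```

For a layer of this size, the factorized form needs far fewer weights than the standard convolution, because the expensive 3x3x3 kernel is no longer applied across every pair of input and output channels. Weight counts are only a proxy for cost; the efficiency figures reported in the abstract come from the paper's own measurements.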

research
11/19/2018

Building Efficient Deep Neural Networks with Unitary Group Convolutions

We propose unitary group convolutions (UGConvs), a building block for CN...
research
02/07/2020

Attentive Group Equivariant Convolutional Networks

Although group convolutional networks are able to learn powerful represe...
research
12/02/2015

Rethinking the Inception Architecture for Computer Vision

Convolutional networks are at the core of most state-of-the-art computer...
research
02/27/2020

XSepConv: Extremely Separated Convolution

Depthwise convolution has gradually become an indispensable operation fo...
research
12/02/2014

Learning Spatiotemporal Features with 3D Convolutional Networks

We propose a simple, yet effective approach for spatiotemporal feature l...
research
03/23/2018

What Do We Understand About Convolutional Networks?

This document will review the most prominent proposals using multilayer ...
research
11/03/2020

Similarity-Based Clustering for Enhancing Image Classification Architectures

Convolutional networks are at the center of best in class computer visio...
