Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

08/02/2016
by   Limin Wang, et al.
0

Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles to design effective ConvNet architectures for action recognition in videos and learn these models given limited training samples. Our first contribution is temporal segment network (TSN), a novel framework for video-based action recognition. which is based on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video. The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network. Our approach obtains the state-the-of-art performance on the datasets of HMDB51 ( 69.4% ) and UCF101 ( 94.2% ). We also visualize the learned ConvNet models, which qualitatively demonstrates the effectiveness of temporal segment network and the proposed good practices.

READ FULL TEXT

page 5

page 7

page 14

research
05/08/2017

Temporal Segment Networks for Action Recognition in Videos

Deep convolutional networks have achieved great success for image recogn...
research
07/08/2015

Towards Good Practices for Very Deep Two-Stream ConvNets

Deep convolutional networks have achieved great success for object recog...
research
12/15/2021

Temporal Shuffling for Defending Deep Action Recognition Models against Adversarial Attacks

Recently, video-based action recognition methods using convolutional neu...
research
06/19/2019

An Action Recognition network for specific target based on rMC and RPN

The traditional methods of action recognition are not specific for the o...
research
08/05/2016

Fusing Deep Convolutional Networks for Large Scale Visual Concept Classification

Deep learning architectures are showing great promise in various compute...
research
11/25/2018

Learning Conditional Random Fields with Augmented Observations for Partially Observed Action Recognition

This paper aims at recognizing partially observed human actions in video...
research
06/11/2018

Massively Parallel Video Networks

We introduce a class of causal video understanding models that aims to i...

Please sign up or login with your details

Forgot password? Click here to reset