TAN: Temporal Aggregation Network for Dense Multi-label Action Recognition

12/14/2018
by   Xiyang Dai, et al.
0

We present Temporal Aggregation Network (TAN) which decomposes 3D convolutions into spatial and temporal aggregation blocks. By stacking spatial and temporal convolutions repeatedly, TAN forms a deep hierarchical representation for capturing spatio-temporal information in videos. Since we do not apply 3D convolutions in each layer but only apply temporal aggregation blocks once after each spatial downsampling layer in the network, we significantly reduce the model complexity. The use of dilated convolutions at different resolutions of the network helps in aggregating multi-scale spatio-temporal information efficiently. Experiments show that our model is well suited for dense multi-label action recognition, which is a challenging sub-topic of action recognition that requires predicting multiple action labels in each frame. We outperform state-of-the-art methods by 5 Charades and Multi-THUMOS dataset respectively.

READ FULL TEXT

page 1

page 7

page 8

research
03/18/2020

STH: Spatio-Temporal Hybrid Convolution for Efficient Action Recognition

Effective and Efficient spatio-temporal modeling is essential for action...
research
08/27/2016

Spatio-temporal Aware Non-negative Component Representation for Action Recognition

This paper presents a novel mid-level representation for action recognit...
research
05/29/2019

Hierarchical Feature Aggregation Networks for Video Action Recognition

Most action recognition methods base on a) a late aggregation of frame l...
research
11/08/2020

Right on Time: Multi-Temporal Convolutions for Human Action Recognition in Videos

The variations in the temporal performance of human actions observed in ...
research
07/22/2021

EAN: Event Adaptive Network for Enhanced Action Recognition

Efficiently modeling spatial-temporal information in videos is crucial f...
research
10/16/2020

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

Detecting and recognizing human action in videos with crowded scenes is ...
research
11/17/2020

Semi-Supervised Few-Shot Atomic Action Recognition

Despite excellent progress has been made, the performance on action reco...

Please sign up or login with your details

Forgot password? Click here to reset