Order-aware Convolutional Pooling for Video Based Action Recognition

01/31/2016
by Peng Wang, et al.

Most video-based action recognition approaches create a video-level representation by temporally pooling the features extracted at each frame. The pooling methods they adopt, however, usually neglect, completely or partially, the dynamic information contained in the temporal domain. This may undermine the discriminative power of the resulting video representation, since the order of the video sequence can reveal how a specific event or action evolves. To overcome this drawback and explore the importance of temporal order information, we propose a novel temporal pooling approach for aggregating frame-level features. Inspired by the capacity of Convolutional Neural Networks (CNNs) to exploit the internal structure of images for information abstraction, we apply a temporal convolution operation to the frame-level representations to extract the dynamic information. However, directly implementing this idea on the original high-dimensional features would inevitably result in parameter explosion. To tackle this problem, we view the temporal evolution of the feature value at each feature dimension as a 1D signal and learn a unique convolutional filter bank for each of these 1D signals. We conduct experiments on two challenging video-based action recognition datasets, HMDB51 and UCF101, and demonstrate that the proposed method is superior to conventional pooling methods.
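The per-dimension idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tensor layout `(D, K, L)` for the filter banks, the hand-set filter values (the paper learns them), and the final max pooling over time are all assumptions made for illustration. Each feature dimension is treated as its own 1D signal and is convolved only with its own bank of `K` filters of length `L`, so the parameter count is `D * K * L`.

```python
import numpy as np


def per_dimension_temporal_conv(frames, filters):
    """Apply a separate 1D filter bank to each feature dimension.

    frames:  (T, D) array, one D-dimensional feature vector per frame.
    filters: (D, K, L) array, K filters of temporal length L per dimension
             (hypothetical layout, chosen for this sketch).
    Returns a (T - L + 1, D, K) array of filter responses.
    """
    T, D = frames.shape
    D_f, K, L = filters.shape
    if D != D_f:
        raise ValueError("filters must provide one bank per feature dimension")
    if T < L:
        raise ValueError("the video must have at least L frames")
    # windows[t, d, :] holds frames[t:t+L, d], i.e. a length-L slice of signal d.
    windows = np.lib.stride_tricks.sliding_window_view(frames, L, axis=0)
    # Correlate every window with the K filters of its own dimension only.
    return np.einsum("tdl,dkl->tdk", windows, filters)


def order_aware_pool(frames, filters):
    """Video-level vector of size D * K.

    Max pooling over time is an assumption of this sketch.
    """
    responses = per_dimension_temporal_conv(frames, filters)
    return responses.max(axis=0).reshape(-1)


# Toy example: one feature dimension, one difference filter.
frames = np.array([[0.0], [1.0], [3.0], [6.0]])   # T = 4, D = 1
filters = np.array([[[-1.0, 1.0]]])               # D = 1, K = 1, L = 2

forward = order_aware_pool(frames, filters)
backward = order_aware_pool(frames[::-1].copy(), filters)
print(forward, backward)          # differs when the frame order is reversed
print(frames.mean(axis=0))        # plain average pooling ignores the order
```

In this toy case the pooled response changes when the frames are reversed, whereas average pooling of the raw features gives the same vector for both orders, which is the order sensitivity the abstract argues for.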


