AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

11/24/2016
by Amlan Kar, et al.

We propose a novel method for temporally pooling the frames of a video for the task of human action recognition. The method is motivated by the observation that only a small number of frames, taken together, contain sufficient information to discriminate the action class present in a video from the rest. The proposed method learns to pool these discriminative and informative frames while discarding most of the non-informative frames, all in a single temporal scan of the video. Our algorithm does so by continuously predicting the discriminative importance of each video frame and then pooling the frames accordingly within a deep learning framework. We show the effectiveness of the proposed pooling method on standard benchmarks, where it consistently improves on baseline pooling methods with both RGB and optical flow based convolutional networks. Further, in combination with complementary video representations, we show results that are competitive with the state of the art on two challenging, publicly available benchmark datasets.
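
The single-scan idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the pooled representation is an importance-weighted running mean of per-frame feature vectors, and that the importance comes from a caller-supplied function (`importance_fn`, a hypothetical stand-in for the learned predictor). The names `adaptive_scan_pool` and `importance_fn` are illustrative only.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def adaptive_scan_pool(
    frames: Sequence[Sequence[float]],
    importance_fn: Callable[[Vector, Sequence[float]], float],
) -> Vector:
    """Pool frame features in one temporal scan (illustrative sketch).

    Assumption: pooling is an importance-weighted running mean.
    `importance_fn(pooled_so_far, frame)` is a hypothetical scorer that
    returns a non-negative weight; a weight of 0 discards the frame.
    """
    if not frames:
        raise ValueError("need at least one frame")

    pooled: Vector = [0.0] * len(frames[0])
    total_weight = 0.0

    for frame in frames:
        # Score the frame given what has been pooled so far.
        weight = importance_fn(pooled, frame)
        if weight <= 0.0:
            continue  # non-informative frame: skip it
        new_total = total_weight + weight
        # Weighted running mean update; no second pass over the video.
        pooled = [
            (total_weight * p + weight * x) / new_total
            for p, x in zip(pooled, frame)
        ]
        total_weight = new_total

    return pooled


# Toy usage: keep only frames whose first feature exceeds 2.
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
selected = adaptive_scan_pool(features, lambda pooled, x: 1.0 if x[0] > 2 else 0.0)
```

With a constant importance of 1 the sketch reduces to plain mean pooling, and with a 0/1 importance it averages only the selected frames. In the setting the abstract describes, the importance would be predicted by a learned model rather than a hand-written rule.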


research
01/31/2016

Order-aware Convolutional Pooling for Video Based Action Recognition

Most video based action recognition approaches create the video-level re...
research
03/29/2021

No frame left behind: Full Video Action Recognition

Not all video frames are equally informative for recognizing an action. ...
research
04/14/2021

Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts

Toward the goal of automatic production for sports broadcasts, a paramou...
research
11/22/2017

Multi-Level Recurrent Residual Networks for Action Recognition

Most existing Convolutional Neural Networks (CNNs) used for action recogn...
research
03/26/2018

Video Representation Learning Using Discriminative Pooling

Popular deep models for action recognition in videos generate independen...
research
09/05/2019

Discriminative Video Representation Learning Using Support Vector Classifiers

Most popular deep models for action recognition in videos generate indep...
research
01/12/2017

Ordered Pooling of Optical Flow Sequences for Action Recognition

Training of Convolutional Neural Networks (CNNs) on long video sequences...
