Generalized Rank Pooling for Activity Recognition

04/07/2017
by Anoop Cherian, et al.

Most popular deep models for action recognition split video sequences into short sub-sequences of a few frames; frame-based features are then pooled to recognize the activity. This pooling step usually discards the temporal order of the frames, which could otherwise be used for better recognition. To address this, we propose a novel pooling method, generalized rank pooling (GRP), that takes as input features from the intermediate layers of a CNN trained on tiny sub-sequences and produces as output the parameters of a subspace that (i) provides a low-rank approximation to the features and (ii) preserves their temporal order. We propose to use these parameters as a compact representation of the video sequence, which is then used in a classification setup. We formulate the objective for computing this subspace as a Riemannian optimization problem on the Grassmann manifold and propose an efficient conjugate gradient scheme for solving it. Experiments on several activity recognition datasets show that our scheme leads to state-of-the-art performance.
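
The two properties asked of the subspace can be made concrete with a small NumPy sketch. It is illustrative only and rests on assumptions that the abstract does not spell out: the function names are hypothetical, temporal order is taken to mean that the squared norm of each frame's projection grows with the frame index, and the subspace is initialised with plain PCA rather than found by the Riemannian conjugate gradient scheme described above.

```python
import numpy as np

def pca_subspace(X, p):
    """Orthonormal basis (d x p): top-p left singular vectors of X (d x n).
    Plain PCA, used here only as a stand-in for the learned subspace."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :p]

def reconstruction_error(X, U):
    """Squared Frobenius error of projecting the columns of X onto span(U)."""
    return float(np.linalg.norm(X - U @ (U.T @ X)) ** 2)

def order_violation(X, U, margin=0.0):
    """Hinge penalty summed over frame pairs i < j whose projected squared
    norms do not increase by at least `margin` (assumed ordering criterion)."""
    energy = np.sum((U.T @ X) ** 2, axis=0)            # one value per frame
    diff = energy[:, None] + margin - energy[None, :]   # entry (i, j): e_i + margin - e_j
    upper = np.triu(diff, k=1)                          # keep only pairs with i < j
    return float(np.maximum(upper, 0.0).sum())

# Usage: columns of X are per-frame features in temporal order.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 10))
U = pca_subspace(X, 3)
print(reconstruction_error(X, U), order_violation(X, U))
```

A subspace that scores low on both quantities gives a low-rank approximation of the features and also respects the frame order under the assumed criterion; PCA alone minimises only the first.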

research
03/27/2018

Non-Linear Temporal Subspace Representations for Activity Recognition

Representations that can compactly and effectively capture the temporal ...
research
05/30/2017

Discriminatively Learned Hierarchical Rank Pooling Networks

In this work, we present novel temporal encoding methods for action and ...
research
05/24/2017

Sequence Summarization Using Order-constrained Kernelized Feature Subspaces

Representations that can compactly and effectively capture temporal evol...
research
04/06/2017

Action Representation Using Classifier Decision Boundaries

Most popular deep learning based models for action recognition are desig...
research
03/26/2018

Video Representation Learning Using Discriminative Pooling

Popular deep models for action recognition in videos generate independen...
research
09/05/2019

Discriminative Video Representation Learning Using Support Vector Classifiers

Most popular deep models for action recognition in videos generate indep...
research
07/24/2018

Learning Discriminative Video Representations Using Adversarial Perturbations

Adversarial perturbations are noise-like patterns that can subtly change...
