Higher-order Pooling of CNN Features via Kernel Linearization for Action Recognition

01/19/2017
by Anoop Cherian, et al.

Most successful deep learning algorithms for action recognition extend models designed for image-based tasks, such as object recognition, to video. Such extensions are typically trained for actions on single video frames or very short clips, and their predictions from sliding windows over the video sequence are then pooled to recognize the action at the sequence level. Usually this pooling step uses the first-order statistics of frame-level action predictions. In this paper, we explore the advantages of using higher-order correlations; specifically, we introduce Higher-order Kernel (HOK) descriptors generated from the late fusion of CNN classifier scores from all the frames in a sequence. To generate these descriptors, we use the idea of kernel linearization: a similarity kernel matrix, which captures the temporal evolution of deep classifier scores, is first linearized into kernel feature maps. The HOK descriptors are then generated from the higher-order co-occurrences of these feature maps and used as input to a video-level classifier. We provide experiments on two fine-grained action recognition datasets and show that our scheme leads to state-of-the-art results.
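To make the pipeline in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-frame classifier scores are available as a NumPy array of shape (frames, classes), it uses random Fourier features as a stand-in way to linearize an RBF kernel (the abstract does not say which linearization is used), and it stops at second-order co-occurrences. The function names and the parameters `n_features`, `gamma`, and `seed` are hypothetical.

```python
import numpy as np


def random_fourier_features(scores, n_features=16, gamma=1.0, seed=0):
    """Explicit feature map z(x) with z(x) . z(y) ~ exp(-gamma * ||x - y||^2).

    Assumption: random Fourier features stand in for the kernel
    linearization step; the original method may differ.
    """
    rng = np.random.default_rng(seed)
    n_classes = scores.shape[1]
    # Frequencies drawn from the Fourier transform of the RBF kernel.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(n_classes, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(scores @ W + b)


def second_order_pool(features):
    """Average the outer products of per-frame feature maps over time."""
    n_frames = features.shape[0]
    cooc = features.T @ features / n_frames
    # The co-occurrence matrix is symmetric, so keep the upper triangle only.
    upper = np.triu_indices(cooc.shape[0])
    return cooc[upper]


def video_descriptor(scores, n_features=16, gamma=1.0, seed=0):
    """Map (frames, classes) scores to one fixed-length video-level vector."""
    feats = random_fourier_features(scores, n_features, gamma, seed)
    return second_order_pool(feats)
```

Under these assumptions, calling `video_descriptor(scores)` on videos of different lengths yields vectors of the same length, `n_features * (n_features + 1) // 2`, which could then be passed to any video-level classifier. Orders above two would replace the outer product with a higher-order tensor product, at a correspondingly higher dimensionality.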

research
04/01/2016

Tensor Representations via Kernel Linearization for Action Recognition from 3D Skeletons (Extended Version)

In this paper, we explore tensor representations that can compactly capt...
research
02/16/2021

Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries

We present EgoACO, a deep neural architecture for video action recogniti...
research
04/06/2017

Action Representation Using Classifier Decision Boundaries

Most popular deep learning based models for action recognition are desig...
research
06/15/2015

Slow and steady feature analysis: higher order temporal coherence in video

How can unlabeled video augment visual learning? Existing methods perfor...
research
11/01/2016

Sliding Dictionary Based Sparse Representation For Action Recognition

The task of action recognition has been in the forefront of research, gi...
research
06/24/2018

A Deeper Look at Power Normalizations

Power Normalizations (PN) are very useful non-linear operators in the co...
research
12/20/2013

EXMOVES: Classifier-based Features for Scalable Action Recognition

This paper introduces EXMOVES, learned exemplar-based features for effic...
