EXMOVES: Classifier-based Features for Scalable Action Recognition

12/20/2013
by Du Tran, et al.

This paper introduces EXMOVES, learned exemplar-based features for efficient recognition of actions in videos. The entries in our descriptor are produced by evaluating a set of movement classifiers over spatio-temporal volumes of the input sequence. Each movement classifier is a simple exemplar-SVM trained on low-level features, i.e., an SVM learned using a single annotated positive space-time volume and a large number of unannotated videos. Our representation offers two main advantages. First, since our mid-level features are learned from individual video exemplars, they require only a minimal amount of supervision. Second, we show that simple linear classification models trained on our global video descriptor yield action recognition accuracy approaching the state of the art at orders of magnitude lower cost, since no sliding window is necessary at test time and linear models are efficient to train and test. This enables scalable action recognition, i.e., efficient classification of a large number of different actions even in large video databases. We show the generality of our approach by building our mid-level descriptors from two different low-level feature representations. The accuracy and efficiency of the approach are demonstrated on several large-scale action recognition benchmarks.
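The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy numbers, and in particular the max-pooling of each exemplar's score over the volumes are assumptions made for illustration, since the abstract does not say how responses are aggregated. Each exemplar-SVM is represented here only by an already-trained linear weight vector and bias, and each space-time volume by a precomputed low-level feature vector.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def exemplar_descriptor(
    volume_features: Sequence[Vector],
    exemplar_weights: Sequence[Vector],
    exemplar_biases: Sequence[float],
) -> List[float]:
    """Build one descriptor entry per exemplar classifier.

    Each entry is the highest linear score the exemplar reaches over all
    space-time volumes of the video (max-pooling is an assumption here).
    """
    descriptor = []
    for w, b in zip(exemplar_weights, exemplar_biases):
        descriptor.append(max(dot(w, x) + b for x in volume_features))
    return descriptor


def linear_classify(
    descriptor: Vector,
    class_weights: Sequence[Vector],
    class_biases: Sequence[float],
) -> int:
    """Return the index of the action class with the highest linear score."""
    scores = [dot(w, descriptor) + b for w, b in zip(class_weights, class_biases)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: 2 volumes with 2-D low-level features, 3 hypothetical exemplars.
volumes = [[1.0, 0.0], [0.0, 2.0]]
exemplars = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]

video_descriptor = exemplar_descriptor(volumes, exemplars, biases)
predicted_class = linear_classify(
    video_descriptor,
    class_weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    class_biases=[0.0, 0.0, 0.0],
)
print(video_descriptor, predicted_class)
```

In this sketch the final classifier costs one inner product per action class on a fixed-length descriptor, which is the kind of test-time cost the abstract attributes to linear models on a global video descriptor.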

