A Bag-of-Words Equivalent Recurrent Neural Network for Action Recognition

03/23/2017
by Alexander Richard, et al.

The traditional bag-of-words approach has found a wide range of applications in computer vision. The standard pipeline consists of generating a visual vocabulary, quantizing the features into histograms of visual words, and a classification step, for which a support vector machine with a non-linear kernel is typically used. Given large amounts of data, however, the model suffers from a lack of discriminative power. This applies particularly to action recognition, where the vast number of video features must be subsampled for unsupervised visual vocabulary generation. Moreover, the kernel computation can be very expensive on large datasets. In this work, we propose a recurrent neural network that is equivalent to the traditional bag-of-words approach but enables discriminative training. The model further allows the kernel computation to be incorporated directly into the neural network, which addresses the complexity issue and allows the complete classification system to be represented within a single network. We evaluate our method on four recent action recognition benchmarks and show that it outperforms the conventional model as well as sparse coding methods.
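To make the equivalence claim concrete, the following minimal sketch shows how a hard-assignment bag-of-words histogram can also be written as a recurrence over the feature sequence. It is an illustration under stated assumptions, not the network proposed in the paper: the vocabulary is random rather than learned, and the names `nearest_word`, `bow_histogram`, and `recurrent_histogram` are hypothetical.

```python
import random


def nearest_word(feature, vocabulary):
    """Index of the visual word closest to `feature` (squared Euclidean distance)."""
    dists = [sum((f - w) ** 2 for f, w in zip(feature, word)) for word in vocabulary]
    return dists.index(min(dists))


def bow_histogram(features, vocabulary):
    """Batch view: count nearest-word assignments, then normalize by sequence length."""
    counts = [0.0] * len(vocabulary)
    for x in features:
        counts[nearest_word(x, vocabulary)] += 1.0
    return [c / len(features) for c in counts]


def recurrent_histogram(features, vocabulary):
    """Recurrent view: h_t = h_{t-1} + one_hot(nearest word of x_t), normalized at the end."""
    h = [0.0] * len(vocabulary)
    for x in features:
        one_hot = [0.0] * len(vocabulary)
        one_hot[nearest_word(x, vocabulary)] = 1.0
        h = [a + b for a, b in zip(h, one_hot)]
    return [a / len(features) for a in h]


# Assumption: random toy data stands in for a learned vocabulary and video features.
rng = random.Random(0)
vocabulary = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
features = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]

batch = bow_histogram(features, vocabulary)
recurrent = recurrent_histogram(features, vocabulary)
```

Both functions return the same normalized histogram, since the recurrence only accumulates the same assignments one feature at a time. The hard nearest-word assignment used here is not differentiable, so discriminative training as described in the abstract would need a formulation that allows gradients to reach the vocabulary; this sketch does not attempt that.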

research
05/29/2014

Feature sampling and partitioning for visual vocabulary generation on large action classification datasets

The recent trend in action recognition is towards larger datasets, an in...
research
07/20/2014

Feature and Region Selection for Visual Learning

Visual learning problems such as object classification and action recogn...
research
05/18/2014

Bag of Visual Words and Fusion Methods for Action Recognition: Comprehensive Study and Good Practice

Video based action recognition is one of the important and challenging p...
research
09/30/2018

Improving Bag-of-Visual-Words Towards Effective Facial Expressive Image Classification

Bag-of-Visual-Words (BoVW) approach has been widely used in the recent y...
research
07/23/2014

Visual Word Selection without Re-Coding and Re-Pooling

The Bag-of-Words (BoW) representation is widely used in computer vision....
research
04/23/2013

A Bag of Visual Words Approach for Symbols-Based Coarse-Grained Ancient Coin Classification

The field of Numismatics provides the names and descriptions of the symb...
research
11/28/2017

Scalable and Compact 3D Action Recognition with Approximated RBF Kernel Machines

Despite the recent deep learning (DL) revolution, kernel machines still ...
