Joint Recognition and Segmentation of Actions via Probabilistic Integration of Spatio-Temporal Fisher Vectors

02/04/2016
by   Johanna Carvajal, et al.

We propose a hierarchical approach to multi-action recognition that performs joint classification and segmentation. A given video (containing several consecutive actions) is processed via a sequence of overlapping temporal windows. Each frame in a temporal window is represented through selective low-level spatio-temporal features which efficiently capture relevant local dynamics. Features from each window are represented as a Fisher vector, which captures first- and second-order statistics. Instead of directly classifying each Fisher vector, it is converted into a vector of class probabilities. The final classification decision for each frame is then obtained by integrating the class probabilities at the frame level, which exploits the overlapping of the temporal windows. Experiments were performed on two datasets: s-KTH (a stitched version of the KTH dataset to simulate multi-actions) and the challenging CMU-MMAC dataset. On s-KTH, the proposed approach achieves an accuracy of 85.0%, compared with approaches based on GMMs and HMMs which obtained 78.3% [...]. On CMU-MMAC, the proposed approach achieves an accuracy of 40.9%, compared with [...] approaches which obtained 33.7% [...]. The proposed system is on average 40 times faster than the GMM based approach.
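The frame-level integration step described in the abstract can be illustrated with a small sketch. It assumes each overlapping window has already been turned into a vector of class probabilities, and it assumes the integration is a plain average over all windows that cover a frame; the abstract does not state the exact rule, so the averaging, the function name `integrate_window_probabilities`, and its parameters are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def integrate_window_probabilities(window_probs, window_starts, window_len, n_frames):
    """Average per-window class probabilities over all windows covering each frame.

    Assumption: simple averaging followed by argmax; the paper's exact
    integration rule may differ.
    """
    window_probs = np.asarray(window_probs, dtype=float)
    n_classes = window_probs.shape[1]
    acc = np.zeros((n_frames, n_classes))   # summed probabilities per frame
    counts = np.zeros(n_frames)             # number of windows covering each frame

    for start, probs in zip(window_starts, window_probs):
        end = min(start + window_len, n_frames)
        acc[start:end] += probs
        counts[start:end] += 1

    covered = counts > 0
    acc[covered] = acc[covered] / counts[covered][:, None]

    labels = np.full(n_frames, -1)          # -1 marks frames not covered by any window
    labels[covered] = acc[covered].argmax(axis=1)
    return labels, acc

# Hypothetical example: 8 frames, windows of 4 frames with a stride of 2, 2 classes.
example_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
example_labels, example_frame_probs = integrate_window_probabilities(
    example_probs, window_starts=[0, 2, 4], window_len=4, n_frames=8
)
print(example_labels)
```

Because every frame receives its own label, the boundaries between consecutive actions fall out of the classification itself, which is how recognition and segmentation can be obtained jointly.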


