Segmental Spatiotemporal CNNs for Fine-grained Action Segmentation

02/09/2016
by Colin Lea, et al.

Joint segmentation and classification of fine-grained actions is important for applications in human-robot interaction, video surveillance, and human skill evaluation. However, despite substantial recent progress in large-scale action classification, the performance of state-of-the-art fine-grained action recognition approaches remains low. We propose a model for action segmentation that combines low-level spatiotemporal features with a high-level segmental classifier. Our spatiotemporal CNN consists of a spatial component, which uses convolutional filters to capture information about objects and their relationships, and a temporal component, which uses large 1D convolutional filters to capture how object relationships change over time. These features are used in tandem with a semi-Markov model that captures transitions from one action to another. We introduce an efficient constrained segmental inference algorithm for this model that is orders of magnitude faster than the current approach. We demonstrate the effectiveness of our Segmental Spatiotemporal CNN on cooking and surgical action datasets, on which we observe substantially improved performance relative to recent baseline methods.
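
To make the idea of constrained segmental inference concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it assumes per-frame class scores (for example, the output of a spatiotemporal CNN), a pairwise transition score applied only when the label changes, and an upper bound on the number of segments. The function name `segment_with_max_k` and its parameters are hypothetical, chosen for illustration only. Under these assumptions, dynamic programming finds the best labeling in O(K T C^2) time for K segments, T frames, and C classes.

```python
def segment_with_max_k(scores, trans, max_segments):
    """Best frame labeling that uses at most `max_segments` segments.

    scores: T x C nested list of per-frame class scores (assumed input).
    trans:  C x C nested list; trans[p][c] is added when label p is
            followed by a different label c (assumed scoring scheme).
    Returns (total_score, labels) where labels has length T.
    """
    T, C = len(scores), len(scores[0])
    NEG = float("-inf")
    # best[k][t][c]: best score of frames 0..t using exactly k+1 segments,
    # with frame t labeled c. back stores the predecessor state (k, c).
    best = [[[NEG] * C for _ in range(T)] for _ in range(max_segments)]
    back = [[[None] * C for _ in range(T)] for _ in range(max_segments)]

    for c in range(C):
        best[0][0][c] = scores[0][c]

    for t in range(1, T):
        for k in range(max_segments):
            for c in range(C):
                # Option 1: extend the current segment with the same label.
                cand, arg = best[k][t - 1][c], (k, c)
                # Option 2: start a new segment, coming from another label.
                if k > 0:
                    for p in range(C):
                        if p == c:
                            continue
                        v = best[k - 1][t - 1][p] + trans[p][c]
                        if v > cand:
                            cand, arg = v, (k - 1, p)
                if cand > NEG:
                    best[k][t][c] = cand + scores[t][c]
                    back[k][t][c] = arg

    # Pick the best final state over all segment counts and labels.
    k, c = max(
        ((kk, cc) for kk in range(max_segments) for cc in range(C)),
        key=lambda kc: best[kc[0]][T - 1][kc[1]],
    )
    total = best[k][T - 1][c]

    # Follow back-pointers from the last frame to the first.
    labels = [c]
    for t in range(T - 1, 0, -1):
        k, c = back[k][t][c]
        labels.append(c)
    labels.reverse()
    return total, labels


# Toy example: frame 2 weakly prefers class 1, which would fragment the output.
scores = [[2, 0], [2, 0], [0, 1], [2, 0], [0, 2], [0, 2]]
trans = [[0, 0], [0, 0]]
print(segment_with_max_k(scores, trans, max_segments=2))  # (10, [0, 0, 0, 0, 1, 1])
print(segment_with_max_k(scores, trans, max_segments=4))  # (11, [0, 0, 1, 0, 1, 1])
```

With the segment budget set to 2, the isolated one-frame switch to class 1 is suppressed. With a budget of 4, the labeling follows the per-frame maximum. Bounding the number of segments in this way is one means of discouraging over-segmentation without enumerating every possible segment boundary and duration.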


