Finding Action Tubes

11/21/2014
by Georgia Gkioxari, et al.

We address the problem of action detection in videos. Driven by the latest progress in object detection from 2D images, we build action models using rich feature hierarchies derived from shape and kinematic cues. We incorporate appearance and motion in two ways. First, starting from image region proposals we select those that are motion salient and thus are more likely to contain the action. This leads to a significant reduction in the number of regions being processed and allows for faster computations. Second, we extract spatio-temporal feature representations to build strong classifiers using Convolutional Neural Networks. We link our predictions to produce detections consistent in time, which we call action tubes. We show that our approach outperforms other techniques in the task of action detection.
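
The linking step mentioned above can be illustrated with a small sketch. The abstract does not spell out the linking procedure, so everything here is an assumption made for illustration: each frame carries a list of scored boxes, consecutive boxes are rewarded for a high classifier score plus spatial overlap (IoU), and the best sequence is found by dynamic programming. The names `iou`, `link_tube`, and `overlap_weight` are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, classifier score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tube(frames: List[List[Detection]],
              overlap_weight: float = 1.0) -> List[int]:
    """Pick one detection index per frame so that the sum of scores plus
    weighted overlap between consecutive picks is maximal.

    Assumed objective (illustrative only):
        sum_t score_t + overlap_weight * sum_t IoU(box_t, box_{t+1})
    """
    if not frames or any(not f for f in frames):
        raise ValueError("every frame needs at least one detection")

    # best[i]: best total for a path ending at detection i of the current frame
    best = [score for _, score in frames[0]]
    back: List[List[int]] = []  # back-pointers, one list per frame after the first

    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for box, score in frames[t]:
            cands = [best[j] + overlap_weight * iou(prev_box, box)
                     for j, (prev_box, _) in enumerate(frames[t - 1])]
            j_star = max(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[j_star] + score)
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Trace the best path backwards from the last frame.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path
```

With `overlap_weight` set to 0 the sketch reduces to picking the highest-scoring box in each frame independently; a positive weight favors sequences of boxes that stay spatially consistent over time, which is the property the abstract attributes to action tubes.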


research
11/20/2018

A Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos

Existing approaches for spatio-temporal action detection in videos are l...
research
11/29/2018

Discovering Spatio-Temporal Action Tubes

In this paper, we address the challenging problem of spatial and tempora...
research
04/19/2019

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

In this paper, we propose Spatio-TEmporal Progressive (STEP) action dete...
research
05/26/2016

Automatic Action Annotation in Weakly Labeled Videos

Manual spatio-temporal annotation of human action in videos is laborious...
research
05/04/2017

Am I Done? Predicting Action Progress in Videos

In this paper we introduce the problem of predicting action progress in ...
research
02/14/2023

YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection

Designing a real-time framework for the spatio-temporal action detection...
