Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation

07/31/2017
by Zhenheng Yang, et al.

In this work, we address the problem of spatio-temporal action detection in temporally untrimmed videos. The task is both important and challenging, because localizing human actions accurately in time and space is essential for analyzing large-scale video data. To tackle this problem, we propose a cascade proposal and location anticipation (CPLA) model for frame-level action detection. The model has two salient points: (1) a cascade region proposal network (casRPN) is adopted for action proposal generation and shows better localization accuracy than a single region proposal network (RPN); (2) the spatio-temporal consistency of actions is exploited via a location anticipation network (LAN), so that frame-level action detection is not conducted independently on each frame. Frame-level detections are then linked by solving a linking score maximization problem and temporally trimmed into spatio-temporal action tubes. We demonstrate the effectiveness of our model on the challenging UCF101 and LIRIS-HARL datasets, achieving state-of-the-art performance on both.
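The linking step mentioned in the abstract can be illustrated with a small dynamic-programming sketch. The abstract does not define the linking score, so the code assumes a common form: the sum of per-frame action scores plus a weighted box overlap (IoU) between consecutive frames. The names `link_detections`, `iou`, and `overlap_weight` are hypothetical and are not taken from the paper; temporal trimming is not shown.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, action score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Detection]],
                    overlap_weight: float = 1.0) -> List[int]:
    """Pick one detection per frame so that the assumed linking score
    (sum of action scores + overlap_weight * IoU of consecutive boxes)
    is maximal. Returns the chosen detection index for each frame."""
    if not frames or any(len(f) == 0 for f in frames):
        raise ValueError("every frame needs at least one detection")

    # best[i]: best accumulated score of a path ending at detection i
    best = [score for _, score in frames[0]]
    back: List[List[int]] = []  # back-pointers for frames 1..T-1

    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for box, score in frames[t]:
            candidates = [
                best[j] + overlap_weight * iou(prev_box, box)
                for j, (prev_box, _) in enumerate(frames[t - 1])
            ]
            j_star = max(range(len(candidates)), key=candidates.__getitem__)
            new_best.append(candidates[j_star] + score)
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Trace the best path back from the last frame.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path
```

Under these assumptions, the returned index list describes a single action path; a full pipeline would repeat the search for further paths and then trim each path in time.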


research
04/21/2022

A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions

Spatio-temporal action detection is an important and challenging problem...
research
11/20/2018

A Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos

Existing approaches for spatio-temporal action detection in videos are l...
research
11/29/2018

Discovering Spatio-Temporal Action Tubes

In this paper, we address the challenging problem of spatial and tempora...
research
11/12/2020

Universal Embeddings for Spatio-Temporal Tagging of Self-Driving Logs

In this paper, we tackle the problem of spatio-temporal tagging of self-...
research
07/22/2017

Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

Current state-of-the-art human action recognition is focused on the clas...
research
07/07/2016

Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge

Current state-of-the-art human activity recognition is focused on the cl...
research
01/30/2018

Object Detection in Videos by Short and Long Range Object Linking

We address the problem of detecting objects in videos with the interest ...
