S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

07/21/2018
by   Da Zhang, et al.
0

In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network. Our architecture, named S3D, encodes the entire video stream and discretizes the output space of temporal activity spans into a set of default spans over different temporal locations and scales. At prediction time, S3D predicts scores for the presence of activity categories in each default span and produces temporal adjustments relative to the span location to predict the precise activity duration. Unlike many state-of-the-art systems that require a separate proposal and classification stage, our S3D is intrinsically simple and dedicatedly designed for single-shot, end-to-end temporal activity detection. When evaluating on THUMOS'14 detection benchmark, S3D achieves state-of-the-art performance and is very efficient and can operate at 1271 FPS.

READ FULL TEXT

page 2

page 10

research
03/31/2020

Revisiting Few-shot Activity Detection with Class Similarity Control

Many interesting events in the real world are rare making preannotated m...
research
12/25/2018

Similarity R-C3D for Few-shot Temporal Activity Detection

Many activities of interest are rare events, with only a few labeled exa...
research
01/28/2018

Contextual Multi-Scale Region Convolutional 3D Network for Activity Detection

Activity detection is a fundamental problem in computer vision. Detectin...
research
03/22/2017

R-C3D: Region Convolutional 3D Network for Temporal Activity Detection

We address the problem of activity detection in continuous, untrimmed vi...
research
06/05/2019

Two-Stream Region Convolutional 3D Network for Temporal Activity Detection

We address the problem of temporal activity detection in continuous, unt...
research
03/16/2018

Activity Detection with Latent Sub-event Hierarchy Learning

In this paper, we introduce a new convolutional layer named the Temporal...
research
08/08/2017

Temporal Context Network for Activity Localization in Videos

We present a Temporal Context Network (TCN) for precise temporal localiz...

Please sign up or login with your details

Forgot password? Click here to reset