Temporal Context Network for Activity Localization in Videos

08/08/2017
by   Xiyang Dai, et al.
0

We present a Temporal Context Network (TCN) for precise temporal localization of human activities. Similar to the Faster-RCNN architecture, proposals are placed at equal intervals in a video which span multiple temporal scales. We propose a novel representation for ranking these proposals. Since pooling features only inside a segment is not sufficient to predict activity boundaries, we construct a representation which explicitly captures context around a proposal for ranking it. For each temporal segment inside a proposal, features are uniformly sampled at a pair of scales and are input to a temporal convolutional neural network for classification. After ranking proposals, non-maximum suppression is applied and classification is performed to obtain final detections. TCN outperforms state-of-the-art methods on the ActivityNet dataset and the THUMOS14 dataset.

READ FULL TEXT

page 2

page 8

research
07/12/2018

CTAP: Complementary Temporal Action Proposal Generation

Temporal action proposal generation is an important task, akin to object...
research
09/14/2021

Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos

We address the problem of temporal sentence localization in videos (TSLV...
research
05/30/2017

Generic Tubelet Proposals for Action Localization

We develop a novel framework for action localization in videos. We propo...
research
03/17/2017

TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals

Temporal Action Proposal (TAP) generation is an important problem, as fa...
research
12/08/2019

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization

In this report, we introduce the Winner method for HACS Temporal Action ...
research
07/21/2018

S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

In this paper, we present a novel Single Shot multi-Span Detector for te...
research
07/15/2021

What and When to Look?: Temporal Span Proposal Network for Video Visual Relation Detection

Identifying relations between objects is central to understanding the sc...

Please sign up or login with your details

Forgot password? Click here to reset