Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization

10/22/2020
by Yuanhao Zhai, et al.

Weakly-supervised Temporal Action Localization (W-TAL) aims to classify and localize all action instances in an untrimmed video under only video-level supervision. However, without frame-level annotations, it is challenging for W-TAL methods to identify false-positive action proposals and to generate action proposals with precise temporal boundaries. In this paper, we present a Two-Stream Consensus Network (TSCN) to address both challenges simultaneously. The proposed TSCN features an iterative refinement training method, in which a frame-level pseudo ground truth is iteratively updated and used to provide frame-level supervision for improved model training and for eliminating false-positive action proposals. Furthermore, we propose a new attention normalization loss that encourages the predicted attention to act like a binary selection and promotes precise localization of action instance boundaries. Experiments conducted on the THUMOS14 and ActivityNet datasets show that the proposed TSCN outperforms current state-of-the-art methods, and even achieves results comparable to those of some recent fully-supervised methods.
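
The abstract names two mechanisms without giving formulas: a loss that pushes attention toward binary values, and a frame-level pseudo ground truth derived from the two streams. The sketch below illustrates one plausible reading of each, in plain Python. Every specific here is an assumption made for illustration and is not taken from the abstract: the function names, the top-k/bottom-k form of the loss, the parameter `k`, the averaging of the two attention sequences, and the `threshold` value.

```python
from typing import List


def attention_normalization_loss(attention: List[float], k: int) -> float:
    """Assumed form: mean of the k lowest attention values minus the mean
    of the k highest. Minimizing it drives the highest values toward 1 and
    the lowest toward 0, i.e. toward a binary selection."""
    if not 1 <= k <= len(attention):
        raise ValueError("k must be between 1 and len(attention)")
    ordered = sorted(attention)
    lowest_mean = sum(ordered[:k]) / k
    highest_mean = sum(ordered[-k:]) / k
    return lowest_mean - highest_mean


def fuse_pseudo_ground_truth(
    rgb_attention: List[float],
    flow_attention: List[float],
    threshold: float = 0.5,  # hypothetical value, not from the abstract
) -> List[int]:
    """Assumed fusion rule: average the per-frame attention of the two
    streams and threshold it into binary frame-level labels."""
    if len(rgb_attention) != len(flow_attention):
        raise ValueError("both streams must cover the same frames")
    return [
        1 if (rgb + flow) / 2 > threshold else 0
        for rgb, flow in zip(rgb_attention, flow_attention)
    ]


# Usage: perfectly binary attention gives the lowest possible loss, -1.0.
binary_loss = attention_normalization_loss([0.0, 0.0, 1.0, 1.0], k=2)
# Labels from two hypothetical attention sequences over three frames.
labels = fuse_pseudo_ground_truth([0.9, 0.2, 0.6], [0.7, 0.1, 0.2])
```

In an iterative refinement scheme of the kind the abstract describes, labels like these would be recomputed after each training round and fed back as frame-level supervision. The full text should be consulted for the actual loss and fusion rule.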

research
03/30/2019

RefineLoc: Iterative Refinement for Weakly-Supervised Action Localization

Video action detectors are usually trained using video datasets with ful...
research
12/15/2020

Point-Level Temporal Action Localization: Bridging Fully-supervised Proposals to Weakly-supervised Losses

Point-Level temporal action localization (PTAL) aims to localize actions...
research
06/21/2021

Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track

This technical report presents our solution to the HACS Temporal Action ...
research
10/24/2019

LPAT: Learning to Predict Adaptive Threshold for Weakly-supervised Temporal Action Localization

Recently, Weakly-supervised Temporal Action Localization (WTAL) has been...
research
12/13/2022

Dilation-Erosion for Single-Frame Supervised Temporal Action Localization

To balance the annotation labor and the granularity of supervision, sing...
research
08/10/2017

Exploring Temporal Preservation Networks for Precise Temporal Action Localization

Temporal action localization is an important task of computer vision. Th...
research
08/24/2023

HR-Pro: Point-supervised Temporal Action Localization via Hierarchical Reliability Propagation

Point-supervised Temporal Action Localization (PSTAL) is an emerging res...
