Background Suppression Network for Weakly-supervised Temporal Action Localization

11/22/2019
by Pilhyeon Lee, et al.

Weakly-supervised temporal action localization is challenging because frame-wise labels are unavailable during training; the only supervision is video-level labels indicating whether each video contains action frames of interest. Previous methods aggregate frame-level class scores into a video-level prediction and learn from video-level action labels. This formulation does not fully model the problem, because background frames are forced to be misclassified as action classes in order to predict video-level labels accurately. In this paper, we design the Background Suppression Network (BaS-Net), which introduces an auxiliary class for background and has a two-branch weight-sharing architecture with an asymmetrical training strategy. This enables BaS-Net to suppress activations from background frames and thereby improve localization performance. Extensive experiments demonstrate the effectiveness of BaS-Net and its superiority over state-of-the-art methods on the most popular benchmarks, THUMOS'14 and ActivityNet. Our code and the trained model are available at https://github.com/Pilhyeon/BaSNet-pytorch.
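
The aggregation and background-class ideas described above can be illustrated with a small sketch. It is not taken from the released code: the top-k mean pooling, the per-frame `foreground_weights`, and all function names are assumptions chosen for illustration, and the label assignment (background marked present in one branch and absent in the other) is one plausible reading of the "asymmetrical training strategy".

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def video_level_scores(frame_scores, k):
    """Aggregate frame-level class scores (T frames x C classes) into one
    probability per class. Assumption: average the k highest-scoring frames
    of each class, then apply softmax across classes."""
    k = min(k, len(frame_scores))
    num_classes = len(frame_scores[0])
    pooled = []
    for c in range(num_classes):
        column = sorted((row[c] for row in frame_scores), reverse=True)
        pooled.append(sum(column[:k]) / k)
    return softmax(pooled)


def suppress(frame_scores, foreground_weights):
    """Scale each frame's scores by a foreground weight in [0, 1].
    In a real model these weights would be learned; here they are given."""
    return [[w * s for s in row] for row, w in zip(frame_scores, foreground_weights)]


def branch_labels(action_labels):
    """Append the auxiliary background label to the video-level action labels.
    Assumption: the unsuppressed branch treats background as present (1),
    the suppressed branch treats it as absent (0)."""
    base = list(action_labels) + [1.0]
    suppressed = list(action_labels) + [0.0]
    return base, suppressed


# Toy video: 4 frames, 2 action classes plus a trailing background class.
frame_scores = [
    [4.0, 0.0, 0.5],  # action frame (class 0)
    [3.0, 0.0, 1.0],  # action frame (class 0)
    [0.0, 0.0, 5.0],  # background frame
    [0.5, 0.0, 4.0],  # background frame
]
foreground_weights = [1.0, 1.0, 0.0, 0.1]  # hypothetical filter output

base_probs = video_level_scores(frame_scores, k=2)
supp_probs = video_level_scores(suppress(frame_scores, foreground_weights), k=2)
print(base_probs, supp_probs)
```

In this toy input the background class dominates the video-level prediction before suppression, while action class 0 dominates once the background frames are down-weighted. Both branches call the same scoring function, which stands in for the shared weights.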


