Background Modeling via Uncertainty Estimation for Weakly-supervised Action Localization

06/12/2020
by Pilhyeon Lee, et al.

Weakly-supervised temporal action localization aims to detect the temporal intervals of action instances using only video-level action labels for training. A crucial challenge is to separate action frames from the remaining frames, referred to as background frames (i.e., frames not belonging to any action class). Previous methods attempt background modeling either by synthesizing pseudo background videos from static frames or by introducing an auxiliary class for background. However, they overlook the essential fact that background frames can be dynamic and inconsistent. Accordingly, we cast the problem of identifying background frames as out-of-distribution detection and isolate it from conventional action classification. In addition to our base action localization network, we propose a module that estimates the probability of a frame being background (i.e., uncertainty [20]), which allows us to learn uncertainty from video-level labels alone via multiple instance learning. We further design a background entropy loss that rejects background frames by forcing them to have a uniform probability distribution over action classes. Extensive experiments verify the effectiveness of our background modeling and show that our method significantly outperforms state-of-the-art methods on the standard benchmarks: THUMOS'14 and ActivityNet (1.2 and 1.3). Our code and the trained model are available at https://github.com/Pilhyeon/Background-Modeling-via-Uncertainty-Estimation.
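
The background entropy loss is described above only in words, so a small illustration may help. The sketch below assumes the loss can be written as the cross-entropy between a uniform target and the softmax over action-class logits; the exact formulation in the paper and the released code may differ. The function names (`log_softmax`, `background_entropy_loss`) and the example logits are hypothetical and chosen for illustration only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of class logits."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def background_entropy_loss(logits):
    """Cross-entropy between a uniform target and softmax(logits).

    Assumption for illustration: the target puts 1/C on each of the C
    action classes, so the loss is the mean negative log-probability.
    It is smallest (log C) when the prediction is exactly uniform.
    """
    log_probs = log_softmax(logits)
    return -sum(log_probs) / len(log_probs)


# Hypothetical logits for a 4-class problem.
uniform_loss = background_entropy_loss([0.0, 0.0, 0.0, 0.0])
peaked_loss = background_entropy_loss([8.0, 0.0, 0.0, 0.0])

print(f"uniform prediction: {uniform_loss:.4f}")
print(f"peaked prediction:  {peaked_loss:.4f}")
```

Under this assumed form, a frame whose scores are spread evenly across action classes incurs the minimum loss, while a frame that confidently predicts one action is penalized, which matches the stated goal of pushing background frames toward a uniform distribution.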


