ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization

04/07/2021
by   Sanqing Qu, et al.

Weakly-supervised temporal action localization aims to localize the temporal boundaries of action instances and identify the corresponding action categories using only video-level labels. Traditional methods mainly focus on separating foreground and background frames with only a single attention branch and class activation sequence. However, we argue that, apart from the distinctive foreground and background frames, there are plenty of semantically ambiguous action context frames. It does not make sense to group those context frames into the same background class, since they are semantically related to a specific action category. Consequently, it is challenging to suppress action context frames with only a single class activation sequence. To address this issue, we propose an action-context modeling network termed ACM-Net, which integrates a three-branch attention module to simultaneously measure the likelihood of each temporal point being an action instance, context, or non-action background. Then, based on the obtained three-branch attention values, we construct three-branch class activation sequences to represent the action instances, contexts, and non-action backgrounds individually. To evaluate the effectiveness of ACM-Net, we conduct extensive experiments on two benchmark datasets, THUMOS-14 and ActivityNet-1.3. The experiments show that our method can outperform current state-of-the-art methods and can even achieve performance comparable to fully-supervised methods. Code can be found at https://github.com/ispc-lab/ACM-Net
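To make the three-branch idea concrete, the following is a minimal sketch in plain Python; it is not taken from the ACM-Net repository. It assumes (as illustrative choices not stated in the abstract) that the three attention values come from a softmax over a bias-free linear layer, that each branch's class activation sequence is the base sequence scaled by that branch's attention, and that video-level scores are obtained by top-k temporal averaging. All function names, shapes, and the random inputs are hypothetical.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights):
    """Bias-free linear map per snippet: (T, D) x (D, K) -> (T, K)."""
    columns = list(zip(*weights))
    return [[sum(x * w for x, w in zip(row, col)) for col in columns]
            for row in features]


def three_branch_cas(features, attn_weights, cls_weights):
    """Return attention (T, 3) and three attention-weighted CAS, each (T, C)."""
    # Assumption: softmax attention over (action instance, context, background).
    attention = [softmax(row) for row in linear(features, attn_weights)]
    # Base class activation sequence: per-snippet class logits.
    cas = linear(features, cls_weights)
    branches = []
    for k in range(3):
        branches.append([[att[k] * logit for logit in row]
                         for att, row in zip(attention, cas)])
    return attention, branches


def video_scores(branch_cas, top_k):
    """Pool a (T, C) CAS into C scores by averaging the top-k snippets per class."""
    scores = []
    for per_class in zip(*branch_cas):
        top = sorted(per_class, reverse=True)[:top_k]
        scores.append(sum(top) / len(top))
    return scores


random.seed(0)
T, D, C = 8, 4, 3  # snippets, feature size, classes (all hypothetical)
features = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
attn_w = [[random.gauss(0, 1) for _ in range(3)] for _ in range(D)]
cls_w = [[random.gauss(0, 1) for _ in range(C)] for _ in range(D)]

attention, (inst_cas, ctx_cas, bkg_cas) = three_branch_cas(features, attn_w, cls_w)
inst_scores = video_scores(inst_cas, top_k=2)
```

Under these assumptions the three attention values at each temporal point sum to one, so the three branch sequences add back up to the base class activation sequence. Each branch can then be pooled separately and supervised with its own video-level target.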

research
11/22/2019

Background Suppression Network for Weakly-supervised Temporal Action Localization

Weakly-supervised temporal action localization is a very challenging pro...
research
03/28/2021

ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization

The object of Weakly-supervised Temporal Action Localization (WS-TAL) is...
research
03/27/2020

Weakly-Supervised Action Localization by Generative Attention Modeling

Weakly-supervised temporal action localization is a problem of learning ...
research
04/06/2021

Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization

Weakly-supervised temporal action localization aims to localize actions ...
research
12/19/2022

Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization

Weakly-supervised temporal action localization (WTAL) learns to detect a...
research
08/14/2021

Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization

As a challenging task of high-level video understanding, weakly supervis...
research
11/24/2021

Background-Click Supervision for Temporal Action Localization

Weakly supervised temporal action localization aims at learning the inst...
