Dilation-Erosion for Single-Frame Supervised Temporal Action Localization

12/13/2022
by Bin Wang, et al.

To balance annotation labor against the granularity of supervision, single-frame annotation has been introduced in temporal action localization. It provides a rough temporal location for an action but implicitly overstates the supervision from the annotated frame during training, leading to confusion between actions and backgrounds, i.e., action incompleteness and background false positives. To tackle these two challenges, we present the Snippet Classification model and the Dilation-Erosion module. In the Dilation-Erosion module, we first expand the potential action segments with a loose criterion to alleviate the problem of action incompleteness, and then remove the background from the potential action segments to alleviate the problem of background false positives. Relying on the single-frame annotation and the output of the Snippet Classification model, the Dilation-Erosion module mines pseudo snippet-level ground truth, hard backgrounds, and evident backgrounds, which are in turn used to further train the Snippet Classification model, forming a cyclic dependency. Furthermore, we propose a new embedding loss to aggregate the features of action instances with the same label and separate the features of actions from backgrounds. Experiments on THUMOS14 and ActivityNet 1.2 validate the effectiveness of the proposed method. Code has been made publicly available (https://github.com/LingJun123/single-frame-TAL).
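The dilate-then-erode idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the actual criteria, so the two score thresholds (`loose_thr`, `strict_thr`), the function names, and the rule that every snippet outside the dilated segment counts as evident background are all assumptions made purely for illustration. The input is a list of per-snippet action scores, as a snippet classifier might produce, and the index of the single annotated snippet.

```python
def dilate(scores, anchor, loose_thr):
    """Grow a segment outward from the annotated snippet while
    neighbouring scores stay at or above a loose threshold (assumed criterion)."""
    start = end = anchor
    while start > 0 and scores[start - 1] >= loose_thr:
        start -= 1
    while end < len(scores) - 1 and scores[end + 1] >= loose_thr:
        end += 1
    return start, end


def erode(scores, start, end, anchor, strict_thr):
    """Split the dilated segment into pseudo-action snippets and hard
    backgrounds using a stricter threshold (assumed criterion).
    The annotated snippet is always kept as action."""
    action, hard_bg = [], []
    for t in range(start, end + 1):
        if t == anchor or scores[t] >= strict_thr:
            action.append(t)
        else:
            hard_bg.append(t)
    return action, hard_bg


def mine_pseudo_labels(scores, anchor, loose_thr=0.3, strict_thr=0.6):
    """Hypothetical helper: dilate, then erode, then treat everything
    outside the dilated segment as evident background (an assumption)."""
    start, end = dilate(scores, anchor, loose_thr)
    action, hard_bg = erode(scores, start, end, anchor, strict_thr)
    evident_bg = [t for t in range(len(scores)) if t < start or t > end]
    return action, hard_bg, evident_bg


# Toy scores for a 10-snippet video with the annotated frame in snippet 4.
scores = [0.1, 0.2, 0.4, 0.7, 0.9, 0.8, 0.5, 0.35, 0.1, 0.05]
action, hard_bg, evident_bg = mine_pseudo_labels(scores, anchor=4)
print(action, hard_bg, evident_bg)
```

In the cyclic scheme the abstract describes, labels mined this way would be fed back as training targets for the snippet classifier, whose updated scores would then drive the next round of mining.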


