Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization

08/11/2021
by Pilhyeon Lee, et al.

We tackle the problem of localizing temporal intervals of actions when only a single frame label per action instance is available for training. Owing to label sparsity, existing work fails to learn action completeness, resulting in fragmentary action predictions. In this paper, we propose a novel framework in which dense pseudo-labels are generated to provide completeness guidance for the model. Concretely, we first select pseudo background points to supplement the point-level action labels. Then, taking the points as seeds, we search for the optimal sequence that is likely to contain complete action instances while agreeing with the seeds. To learn completeness from the obtained sequence, we introduce two novel losses that contrast action instances with background ones in terms of action score and feature similarity, respectively. Experimental results demonstrate that our completeness guidance indeed helps the model locate complete action instances, leading to large performance gains, especially under high IoU thresholds. Moreover, we demonstrate the superiority of our method over existing state-of-the-art methods on four benchmarks: THUMOS'14, GTEA, BEOID, and ActivityNet. Notably, our method even performs comparably to recent fully-supervised methods at a 6 times lower annotation cost. Our code is available at https://github.com/Pilhyeon.
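To make the idea of turning sparse point labels into a dense pseudo-label sequence concrete, here is a minimal sketch. It is not the search procedure from the paper, whose details the abstract does not give. It assumes per-frame action scores in [0, 1] for a single class and a set of seed points (labeled action points plus pseudo background points). Between two neighboring seeds with different labels, it places one boundary where the resulting labels agree best with the scores. The function name `dense_labels_from_seeds` and the agreement measure are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def dense_labels_from_seeds(scores: List[float],
                            seeds: List[Tuple[int, int]]) -> List[int]:
    """Expand sparse seeds into a per-frame label sequence (illustrative only).

    scores[t] is an assumed per-frame action score in [0, 1].
    seeds holds (frame_index, label) pairs; label 1 = action, 0 = background.
    """
    seeds = sorted(seeds)
    n = len(scores)
    # Frames before the first seed inherit its label (a simplification).
    labels = [seeds[0][1]] * n

    def agreement(t: int, label: int) -> float:
        # How well a candidate label matches the score at frame t.
        return scores[t] if label == 1 else 1.0 - scores[t]

    for (t0, l0), (t1, l1) in zip(seeds, seeds[1:]):
        if l0 == l1:
            # Same label on both sides: fill the gap with that label.
            best_b = t1 + 1
        else:
            # Try every boundary b: frames t0..b-1 get l0, frames b..t1 get l1.
            best_b, best_total = t0 + 1, float("-inf")
            for b in range(t0 + 1, t1 + 1):
                total = sum(agreement(t, l0) for t in range(t0, b))
                total += sum(agreement(t, l1) for t in range(b, t1 + 1))
                if total > best_total:
                    best_b, best_total = b, total
        for t in range(t0, best_b):
            labels[t] = l0
        for t in range(best_b, t1 + 1):
            labels[t] = l1

    # Frames after the last seed inherit its label (a simplification).
    for t in range(seeds[-1][0], n):
        labels[t] = seeds[-1][1]
    return labels


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.2, 0.1]
    seeds = [(0, 0), (3, 1), (6, 0)]  # background, action, background
    print(dense_labels_from_seeds(scores, seeds))  # [0, 0, 1, 1, 1, 0, 0]
```

In this toy run the single action point at frame 3 grows into the interval covering frames 2 to 4, which is the kind of complete instance that the completeness losses would then contrast against the surrounding background.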

research
09/16/2023

Sub-action Prototype Learning for Point-level Weakly-supervised Temporal Action Localization

Point-level weakly-supervised temporal action localization (PWTAL) aims ...
research
06/12/2020

Background Modeling via Uncertainty Estimation for Weakly-supervised Action Localization

Weakly-supervised temporal action localization aims to detect intervals ...
research
11/15/2021

Weakly-Supervised Dense Action Anticipation

Dense anticipation aims to forecast future actions and their durations f...
research
12/13/2022

Dilation-Erosion for Single-Frame Supervised Temporal Action Localization

To balance the annotation labor and the granularity of supervision, sing...
research
03/31/2022

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization

We target at the task of weakly-supervised action localization (WSAL), w...
research
10/10/2022

An Action Is Worth Multiple Words: Handling Ambiguity in Action Recognition

Precisely naming the action depicted in a video can be a challenging and...
research
07/02/2022

Turning to a Teacher for Timestamp Supervised Temporal Action Segmentation

Temporal action segmentation in videos has drawn much attention recently...
