Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

03/24/2021
by Chuming Lin, et al.

Temporal action localization is an important yet challenging task in video understanding. Typically, the task aims to infer both the action category and the start and end frames of each action instance in a long, untrimmed video. While most current models achieve good results by using pre-defined anchors and numerous actionness scores, such methods suffer from both a large number of outputs and heavy tuning of the locations and sizes corresponding to different anchors. Anchor-free methods, by contrast, are lighter and get rid of redundant hyper-parameters, but have received little attention. In this paper, we propose the first purely anchor-free temporal localization method, which is both efficient and effective. Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module that gathers more valuable boundary features for each proposal with a novel boundary pooling, and (iii) several consistency constraints that ensure our model can find accurate boundaries given arbitrary proposals. Extensive experiments show that our method outperforms all anchor-based and actionness-guided methods by a remarkable margin on THUMOS14, achieving state-of-the-art results, and achieves comparable results on ActivityNet v1.3. Code is available at https://github.com/TencentYoutuResearch/ActionDetection-AFSD.
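
As a rough illustration of two ideas named in the abstract, the sketch below shows (a) how an anchor-free predictor's per-location outputs might be decoded into segments, and (b) one plausible reading of "boundary pooling" as a max over features in a small window around a proposed boundary. This is not the authors' implementation. The function names `decode_segments` and `boundary_max_pool`, the `stride` and `half_width` parameters, and the choice of max pooling over a symmetric window are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import List, Sequence, Tuple


def decode_segments(
    start_dist: Sequence[float],
    end_dist: Sequence[float],
    stride: float = 1.0,
) -> List[Tuple[float, float]]:
    """Decode anchor-free outputs into (start, end) segments.

    Assumption: each temporal location t predicts its distance to the
    action start and to the action end, so no pre-defined anchors are needed.
    """
    segments = []
    for t, (ds, de) in enumerate(zip(start_dist, end_dist)):
        center = t * stride  # position of location t on the time axis
        segments.append((center - ds, center + de))
    return segments


def boundary_max_pool(
    features: Sequence[Sequence[float]],
    boundary: float,
    half_width: float,
) -> List[float]:
    """Max-pool each channel over a window centred on a boundary.

    `features` is a length-T sequence of C-dimensional frame features.
    Assumption: the most salient response near the boundary is taken
    per channel; the window is clamped to the valid range [0, T - 1].
    """
    last = len(features) - 1
    lo = min(max(0, int(round(boundary - half_width))), last)
    hi = max(lo, min(last, int(round(boundary + half_width))))
    window = features[lo:hi + 1]
    # zip(*window) iterates channel by channel across the window.
    return [max(channel) for channel in zip(*window)]


if __name__ == "__main__":
    # Two locations, each predicting distances to start and end.
    print(decode_segments([1.0, 2.0], [3.0, 1.0]))

    # Four frames with two feature channels each.
    feats = [[0.1, 0.9], [0.5, 0.2], [0.3, 0.4], [0.8, 0.1]]
    print(boundary_max_pool(feats, boundary=1, half_width=1))
```

In this sketch the pooled boundary feature could be fed to a refinement step that adjusts the coarse segment; how the refinement and the consistency constraints are actually formulated is described in the full paper and the linked repository, not here.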
