CBR-Net: Cascade Boundary Refinement Network for Action Detection: Submission to ActivityNet Challenge 2020 (Task 1)

by Xiang Wang, et al.

In this report, we present our solution for the task of temporal action localization (detection) (Task 1) in ActivityNet Challenge 2020. The purpose of this task is to temporally localize the intervals where actions of interest occur and to predict the action categories in a long untrimmed video. Our solution mainly includes three components: 1) feature encoding: we apply three kinds of backbones, TSN [7], SlowFast [3] and I3D [1], all pretrained on the Kinetics dataset [2], to extract snippet-level video representations; 2) proposal generation: we choose BMN [5] as our baseline, based on which we design a Cascade Boundary Refinement Network (CBR-Net) to conduct proposal detection. The CBR-Net mainly contains two modules: temporal feature encoding, which applies a BiLSTM to encode long-term temporal information, and a CBR module, which refines proposal precision under different parameter settings; 3) action localization: in this stage, we combine the video-level classification results obtained by the fine-tuned networks to predict the category of each proposal. Moreover, we also apply different ensemble strategies to improve the performance of the designed solution, by which we achieve 42.788% on the ActivityNet v1.3 dataset in terms of the mean Average Precision metric.
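The action localization step, in which video-level classification results are combined with class-agnostic proposals, can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the code below assumes one common choice: each proposal is paired with the top-k video-level classes and scored by the product of proposal confidence and class score. The function name `label_proposals`, the `top_k` parameter and the tuple layouts are hypothetical and are not taken from the report.

```python
from typing import Dict, List, Tuple

# (start_sec, end_sec, proposal_confidence) -- hypothetical layout
Proposal = Tuple[float, float, float]
# (start_sec, end_sec, class_label, fused_score) -- hypothetical layout
Detection = Tuple[float, float, str, float]


def label_proposals(
    proposals: List[Proposal],
    class_scores: Dict[str, float],
    top_k: int = 2,
) -> List[Detection]:
    """Attach video-level class labels to class-agnostic proposals.

    Assumption: the fused score is proposal confidence times class score.
    The report may use a different rule.
    """
    # Keep only the k most likely video-level classes.
    top = sorted(class_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Pair every proposal with every retained class.
    detections = [
        (start, end, label, conf * cls_score)
        for (start, end, conf) in proposals
        for (label, cls_score) in top
    ]

    # Rank detections by fused score, highest first.
    detections.sort(key=lambda d: d[3], reverse=True)
    return detections
```

For example, two proposals and a video-level classifier that favors "Long jump" would yield four detections, with the most confident proposal labeled "Long jump" ranked first.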


Proposal Relation Network for Temporal Action Detection

This technical report presents our solution for temporal action detectio...

Temporal Fusion Network for Temporal Action Localization: Submission to ActivityNet Challenge 2020 (Task E)

This technical report analyzes a temporal action localization method we ...

Relation-Aware Pyramid Network (RapNet) for temporal action proposal

In this technical report, we describe our solution to temporal action pr...

Context-aware Proposal Network for Temporal Action Detection

This technical report presents our first place winning solution for temp...

Multi-Granularity Fusion Network for Proposal and Activity Localization: Submission to ActivityNet Challenge 2019 Task 1 and Task 2

This technical report presents an overview of our solution used in the s...

Estimation of Reliable Proposal Quality for Temporal Action Detection

Temporal action detection (TAD) aims to locate and recognize the actions...

Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos

Temporal action localization is a recently-emerging task, aiming to loca...
