Temporal Context Aggregation Network for Temporal Action Proposal Refinement

by   Zhiwu Qing, et al.

Temporal action proposal generation aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet important task in the video understanding field. The proposals generated by current methods still suffer from inaccurate temporal boundaries and inferior confidence used for retrieval owing to the lack of efficient temporal modeling and effective boundary context utilization. In this paper, we propose Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context aggregation and complementary as well as progressive boundary refinement. Specifically, we first design a Local-Global Temporal Encoder (LGTE), which adopts the channel grouping strategy to efficiently encode both "local and global" temporal inter-dependencies. Furthermore, both the boundary and internal context of proposals are adopted for frame-level and segment-level boundary regressions, respectively. Temporal Boundary Regressor (TBR) is designed to combine these two regression granularities in an end-to-end fashion, which achieves the precise boundaries and reliable confidence of proposals through progressive refinement. Extensive experiments are conducted on three challenging datasets: HACS, ActivityNet-v1.3, and THUMOS-14, where TCANet can generate proposals with high precision and recall. By combining with the existing action classifier, TCANet can obtain remarkable temporal action detection performance compared with other methods. Not surprisingly, the proposed TCANet won the 1^st place in the CVPR 2020 - HACS challenge leaderboard on temporal action localization task.


page 3

page 8


BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

Temporal action proposal generation is an important yet challenging prob...

Precise Temporal Action Localization by Evolving Temporal Proposals

Locating actions in long untrimmed videos has been a challenging problem...

Enriching Local and Global Contexts for Temporal Action Localization

Effectively tackling the problem of temporal action localization (TAL) n...

Boundary Content Graph Neural Network for Temporal Action Proposal Generation

Temporal action proposal generation plays an important role in video act...

DCAN: Improving Temporal Action Detection via Dual Context Aggregation

Temporal action detection aims to locate the boundaries of action in the...

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation

Generating human action proposals in untrimmed videos is an important ye...

Fast Learning of Temporal Action Proposal via Dense Boundary Generator

Generating temporal action proposals remains a very challenging problem,...

Please sign up or login with your details

Forgot password? Click here to reset