Enriching Local and Global Contexts for Temporal Action Localization

07/27/2021
by Zixin Zhu, et al.
Xi'an Jiaotong University
University of Illinois at Chicago

Effectively tackling the problem of temporal action localization (TAL) necessitates a visual representation that jointly pursues two conflicting goals, i.e., fine-grained discrimination for temporal localization and sufficient visual invariance for action classification. We address this challenge by enriching both the local and global contexts in the popular two-stage temporal localization framework, where action proposals are first generated, followed by action classification and temporal boundary regression. Our proposed model, dubbed ContextLoc, can be divided into three sub-networks: L-Net, G-Net and P-Net. L-Net enriches the local context via fine-grained modeling of snippet-level features, which is formulated as a query-and-retrieval process. G-Net enriches the global context via higher-level modeling of the video-level representation. In addition, we introduce a novel context adaptation module to adapt the global context to different proposals. P-Net further models the context-aware inter-proposal relations. We explore two existing models as the P-Net in our experiments. The efficacy of our proposed method is validated by experimental results on the THUMOS14 (54.3% at IoU@0.5) and ActivityNet v1.3 (51.24% at IoU@0.5) datasets, where it outperforms the recent state of the art.
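
The results above are reported at an IoU threshold of 0.5. As a rough illustration of what that threshold measures, the sketch below computes the intersection-over-union of two temporal segments, assuming the usual 1-D definition (overlap duration divided by the duration of the union). The function names `temporal_iou` and `meets_threshold` are illustrative and are not taken from the paper or its code; full benchmark evaluation involves more than this single check.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments given in the same time unit."""
    # Overlap is the length of the intersection, clamped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union is the sum of both lengths minus the shared part.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def meets_threshold(pred, gt, threshold=0.5):
    """Hypothetical helper: does a predicted segment overlap enough with a ground-truth one?"""
    return temporal_iou(pred, gt) >= threshold


if __name__ == "__main__":
    # A prediction covering 2s-10s against a ground-truth action at 0s-10s.
    print(temporal_iou((2.0, 10.0), (0.0, 10.0)))
    print(meets_threshold((2.0, 10.0), (0.0, 10.0)))
```

Under this definition a predicted segment can miss a boundary by a fair margin and still pass at 0.5, which is why localization results are often also examined at stricter thresholds.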

03/24/2021

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal action proposal generation aims to estimate temporal intervals ...
03/22/2021

Context-aware Biaffine Localizing Network for Temporal Sentence Grounding

This paper addresses the problem of temporal sentence grounding (TSG), w...
12/08/2019

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization

In this report, we introduce the Winner method for HACS Temporal Action ...
10/02/2020

Enhancing Fine-grained Sentiment Classification Exploiting Local Context Embedding

Target-oriented sentiment classification is a fine-grained task of natur...
04/13/2018

Precise Temporal Action Localization by Evolving Temporal Proposals

Locating actions in long untrimmed videos has been a challenging problem...
08/02/2019

Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos

Temporal action localization is a recently-emerging task, aiming to loca...
08/08/2017

Temporal Context Network for Activity Localization in Videos

We present a Temporal Context Network (TCN) for precise temporal localiz...