Colar: Effective and Efficient Online Action Detection by Consulting Exemplars

03/02/2022
by Le Yang, et al.

Online action detection has attracted increasing research interest in recent years. Current works model historical dependencies and anticipate the future to perceive how an action evolves within a video segment and thereby improve detection accuracy. However, the existing paradigm ignores category-level modeling and does not pay sufficient attention to efficiency. Within a category, the representative frames exhibit diverse characteristics. Thus, category-level modeling can provide guidance that complements temporal dependency modeling. In this paper, we develop an effective exemplar-consultation mechanism that first measures the similarity between a frame and exemplary frames, and then aggregates exemplary features based on the similarity weights. The mechanism is also efficient, as both similarity measurement and feature aggregation require limited computation. Based on the exemplar-consultation mechanism, long-term dependencies can be captured by regarding historical frames as exemplars, and category-level modeling can be achieved by regarding representative frames from a category as exemplars. Owing to the complementary information from category-level modeling, our method employs a lightweight architecture yet achieves new high performance on three benchmarks. In addition, when a spatio-temporal network is used to process video frames, our method takes 9.8 seconds to process a one-minute video and achieves comparable performance.
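The two steps of the exemplar-consultation mechanism (similarity measurement, then similarity-weighted aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity function or how the weights are normalized, so cosine similarity, a softmax with a `temperature` parameter, and the function name `consult_exemplars` are all assumptions made here for illustration.

```python
import math


def consult_exemplars(frame, exemplars, temperature=1.0):
    """Aggregate exemplar features weighted by their similarity to a frame.

    frame:     list of floats, the current frame feature (length d)
    exemplars: list of exemplar features, each a list of d floats
    Returns (aggregated_feature, weights).
    Assumption: cosine similarity + softmax; the paper may use something else.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def norm(a):
        # Fall back to 1.0 for an all-zero vector to avoid division by zero.
        return math.sqrt(dot(a, a)) or 1.0

    # Step 1: measure similarity between the frame and every exemplar.
    sims = [dot(frame, e) / (norm(frame) * norm(e)) for e in exemplars]

    # Turn similarities into weights with a numerically stable softmax.
    peak = max(sims)
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    weights = [x / total for x in exps]

    # Step 2: aggregate exemplar features using the similarity weights.
    dim = len(frame)
    aggregated = [
        sum(w * e[i] for w, e in zip(weights, exemplars)) for i in range(dim)
    ]
    return aggregated, weights


# The same routine can serve both uses named in the abstract:
# historical frames as exemplars, or representative frames of a category.
current = [0.9, 0.1, 0.0]
history = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
history_feature, history_weights = consult_exemplars(current, history)
```

Both steps cost time linear in the number of exemplars and the feature dimension, which is consistent with the abstract's remark that the mechanism requires limited computation.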

research
07/07/2021

Long Short-Term Transformer for Online Action Detection

In this paper, we present Long Short-term TRansformer (LSTR), a new temp...
research
07/21/2022

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection

The task of action detection aims at deducing both the action category a...
research
01/21/2020

A Comprehensive Study on Temporal Modeling for Online Action Detection

Online action detection (OAD) is a practical yet challenging task, which...
research
09/11/2018

Temporal-Spatial Mapping for Action Recognition

Deep learning models have enjoyed great success for image related comput...
research
05/26/2022

Efficient U-Transformer with Boundary-Aware Loss for Action Segmentation

Action classification has made great progress, but segmenting and recogn...
research
08/18/2021

Target Adaptive Context Aggregation for Video Scene Graph Generation

This paper deals with a challenging task of video scene graph generation...
