MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition

04/03/2023
by Xiang Wang, et al.

Current state-of-the-art approaches for few-shot action recognition achieve promising performance by conducting frame-level matching on learned visual features. However, they generally suffer from two limitations: i) the matching procedure between local frames tends to be inaccurate because nothing guides the model toward long-range temporal perception; ii) explicit motion learning is usually ignored, leading to partial information loss. To address these issues, we develop a Motion-augmented Long-short Contrastive Learning (MoLo) method that contains two crucial components: a long-short contrastive objective and a motion autodecoder. Specifically, the long-short contrastive objective endows local frame features with long-form temporal awareness by maximizing their agreement with the global token of videos belonging to the same class. The motion autodecoder is a lightweight architecture that reconstructs pixel motions from the differential features, which explicitly embeds motion dynamics in the network. In this way, MoLo can simultaneously learn long-range temporal context and motion cues for comprehensive few-shot matching. To demonstrate its effectiveness, we evaluate MoLo on five standard benchmarks, and the results show that MoLo favorably outperforms recent advanced methods. The source code is available at https://github.com/alibaba-mmai-research/MoLo.
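As a rough illustration of the long-short contrastive idea described above, the sketch below computes an InfoNCE-style loss that pulls each local frame feature toward the global tokens of same-class videos and away from the others. It is not the authors' implementation: the function names, the cosine similarity, the temperature value, and the averaging over positives are all assumptions made for illustration, and the abstract does not specify the exact form of the objective.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def long_short_contrastive_loss(frame_feats, frame_labels,
                                global_tokens, token_labels,
                                temperature=0.07):
    """Hypothetical InfoNCE-style loss between local frames and global tokens.

    frame_feats   : list of local (short-range) frame feature vectors
    frame_labels  : class label of the video each frame comes from
    global_tokens : list of video-level (long-range) token vectors
    token_labels  : class label of each global token
    temperature   : softmax temperature (assumed value, not from the source)
    """
    losses = []
    for feat, label in zip(frame_feats, frame_labels):
        # Similarity of this frame to every global token, scaled by temperature.
        logits = [cosine(feat, tok) / temperature for tok in global_tokens]
        # Numerically stable log-sum-exp over all tokens (the denominator).
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Positives are global tokens of videos from the same class.
        positives = [l for l, t in zip(logits, token_labels) if t == label]
        if not positives:
            continue  # no same-class token available for this frame
        # Negative mean log-probability assigned to the positives.
        losses.append(-sum(l - log_denom for l in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy usage: two frames and two global tokens from two classes.
frames = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
aligned = long_short_contrastive_loss(frames, [0, 1], tokens, [0, 1])
misaligned = long_short_contrastive_loss(frames, [0, 1], tokens, [1, 0])
```

In this toy setup the loss is small when each frame points in the same direction as its same-class global token and large when the class assignment is swapped, which is the behavior a long-short agreement objective is meant to encourage.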

research
04/03/2020

TEA: Temporal Excitation and Aggregation for Action Recognition

Temporal modeling is key for action recognition in videos. It normally c...
research
03/06/2023

CLIP-guided Prototype Modulating for Few-shot Action Recognition

Learning from large-scale contrastive language-image pre-training like C...
research
07/20/2020

Hierarchical Contrastive Motion Learning for Video Action Recognition

One central question for video action recognition is how to model motion...
research
01/15/2021

Temporal-Relational CrossTransformers for Few-Shot Action Recognition

We propose a novel approach to few-shot action recognition, finding temp...
research
10/19/2021

LSTC: Boosting Atomic Action Detection with Long-Short-Term Context

In this paper, we place the atomic action detection problem into a Long-...
research
08/18/2023

Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching

Class prototype construction and matching are core aspects of few-shot a...
