Motion Sensitive Contrastive Learning for Self-supervised Video Representation

08/12/2022
by Jingcheng Ni, et al.

Contrastive learning has shown great potential in video representation learning. However, existing approaches do not sufficiently exploit short-term motion dynamics, which are crucial to various downstream video understanding tasks. In this paper, we propose Motion Sensitive Contrastive Learning (MSCL), which injects the motion information captured by optical flow into RGB frames to strengthen feature learning. To achieve this, in addition to clip-level global contrastive learning, we develop Local Motion Contrastive Learning (LMCL) with frame-level contrastive objectives across the two modalities. Moreover, we introduce Flow Rotation Augmentation (FRA) to generate extra motion-shuffled negative samples and Motion Differential Sampling (MDS) to accurately screen training samples. Extensive experiments on standard benchmarks validate the effectiveness of the proposed method. With the commonly used 3D ResNet-18 as the backbone, we achieve top-1 accuracies of 91.5% on UCF101 and 50.3% on Something-Something v2 for video classification, and a top-1 recall of 65.6% on UCF101 for video retrieval, notably improving on the state of the art.
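The abstract does not state the loss used for the frame-level cross-modal objective. As a rough illustration only, the sketch below implements a generic InfoNCE loss between per-frame RGB and optical-flow features. It is a standard formulation, not necessarily the authors' LMCL loss. The function name `info_nce`, the helper `_normalize`, the temperature value, and the convention that frame i of one modality is the positive for frame i of the other are all assumptions made for this example.

```python
import math


def _normalize(vec):
    """Scale a feature vector to unit L2 norm (assumes it is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(rgb_feats, flow_feats, temperature=0.1):
    """Generic cross-modal InfoNCE loss (illustrative, not the paper's exact loss).

    rgb_feats, flow_feats: lists of N feature vectors of equal dimension.
    Assumption: rgb_feats[i] and flow_feats[i] come from the same frame and
    form the positive pair; every other flow feature acts as a negative.
    """
    rgb = [_normalize(v) for v in rgb_feats]
    flow = [_normalize(v) for v in flow_feats]

    total = 0.0
    for i, anchor in enumerate(rgb):
        # Cosine similarity of this RGB frame to every flow frame, scaled
        # by the temperature.
        logits = [
            sum(a * b for a, b in zip(anchor, f)) / temperature for f in flow
        ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(l - peak) for l in logits)
        )
        # Negative log-probability assigned to the matching flow frame.
        total += log_denominator - logits[i]
    return total / len(rgb)
```

The loss is small when each RGB feature is most similar to the flow feature of the same frame and grows when the pairing is broken, which is the behaviour a cross-modal contrastive objective relies on.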

