Self-Supervised Video Representation Learning by Video Incoherence Detection

09/26/2021
by Haozhi Cao, et al.

This paper introduces a novel self-supervised method that leverages incoherence detection for video representation learning. It stems from the observation that the human visual system can easily identify video incoherence based on a comprehensive understanding of videos. Specifically, each training sample, denoted as an incoherent clip, is constructed from multiple sub-clips hierarchically sampled from the same raw video, with incoherence of various lengths between consecutive sub-clips. The network is trained to learn high-level representations by predicting the location and length of the incoherence given the incoherent clip as input. Additionally, intra-video contrastive learning is introduced to maximize the mutual information between incoherent clips from the same raw video. We evaluate the proposed method through extensive experiments on action recognition and video retrieval using various backbone networks. Experiments show that, compared with previous coherence-based methods, the proposed method achieves state-of-the-art performance across different backbone networks and different datasets.
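To make the clip construction concrete, here is a minimal Python sketch of how an incoherent clip and its labels might be sampled. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_incoherent_clip`, its parameters, the fixed sub-clip length, and the uniform gap sampling are all hypothetical, and the sketch lays sub-clips out sequentially rather than reproducing the paper's hierarchical sampling scheme.

```python
import random


def sample_incoherent_clip(num_frames, sub_clip_len=8, num_sub_clips=2,
                           max_gap=4, rng=None):
    """Sample frame indices for one incoherent clip (hypothetical helper).

    Returns (indices, locations, gaps):
      indices   - frame indices of the concatenated sub-clips
      locations - positions inside the clip where a sub-clip boundary falls
      gaps      - number of skipped frames at each boundary (incoherence length)
    """
    rng = rng or random.Random()
    # Assumption: each incoherence length is drawn uniformly from 1..max_gap.
    gaps = [rng.randint(1, max_gap) for _ in range(num_sub_clips - 1)]
    span = sub_clip_len * num_sub_clips + sum(gaps)
    if span > num_frames:
        raise ValueError("raw video is too short for the requested clip")

    cursor = rng.randint(0, num_frames - span)  # start of the first sub-clip
    indices, locations = [], []
    for i in range(num_sub_clips):
        indices.extend(range(cursor, cursor + sub_clip_len))
        cursor += sub_clip_len
        if i < num_sub_clips - 1:
            locations.append(len(indices))  # boundary position in the clip
            cursor += gaps[i]               # skip frames to create incoherence
    return indices, locations, gaps


indices, locations, gaps = sample_incoherent_clip(100, rng=random.Random(0))
```

In this sketch, `locations` and `gaps` play the role of the prediction targets described above (where the incoherence occurs and how long it is); a network would receive the frames at `indices` as input.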

research
08/13/2020

Self-supervised Video Representation Learning by Pace Prediction

This paper addresses the problem of self-supervised video representation...
research
12/11/2021

Self-supervised Spatiotemporal Representation Learning by Exploiting Video Continuity

Recent self-supervised video representation learning methods have found ...
research
04/08/2022

Probabilistic Representations for Video Contrastive Learning

This paper presents Probabilistic Video Contrastive Learning, a self-sup...
research
03/05/2020

Self-Supervised Spatio-Temporal Representation Learning Using Variable Playback Speed Prediction

We propose a self-supervised learning method by predicting the variable ...
research
07/26/2022

Static and Dynamic Concepts for Self-supervised Video Representation Learning

In this paper, we propose a novel learning scheme for self-supervised vi...
research
08/31/2020

Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics

This paper proposes a novel pretext task to address the self-supervised ...
research
05/10/2023

Self-Supervised Video Representation Learning via Latent Time Navigation

Self-supervised video representation learning aimed at maximizing simila...