Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization

03/20/2023
by Fida Mohammad Thoker, et al.

We propose a self-supervised method for learning motion-focused video representations. Existing approaches minimize distances between temporally augmented videos, which maintain high spatial similarity. We instead propose to learn similarities between videos with identical local motion dynamics but an otherwise different appearance. We do so by adding synthetic motion trajectories, which we refer to as tubelets, to videos. By simulating different tubelet motions and applying transformations, such as scaling and rotation, we introduce motion patterns beyond what is present in the pretraining data. This allows us to learn a video representation that is remarkably data-efficient: our approach maintains performance when using only 25% of the pretraining videos. Experiments on 10 diverse downstream settings demonstrate our competitive performance and generalizability to new domains and fine-grained actions.
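
The idea of pairing videos that share local motion but differ in appearance can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`linear_trajectory`, `paste_tubelet`, `make_positive_pair`), the random-noise patch, the straight-line motion, and the 16-pixel patch size are all assumptions chosen for illustration, and the scaling and rotation transformations mentioned in the abstract are omitted. The sketch pastes the same moving patch onto two different clips, giving two views whose only shared content is the synthetic motion.

```python
import numpy as np

def linear_trajectory(num_frames, start, end):
    """Top-left (y, x) patch position per frame, linearly interpolated."""
    t = np.linspace(0.0, 1.0, num_frames)[:, None]  # shape (T, 1)
    pos = (1.0 - t) * np.asarray(start, float) + t * np.asarray(end, float)
    return np.rint(pos).astype(int)  # shape (T, 2)

def paste_tubelet(clip, patch, trajectory):
    """Copy clip (T, H, W, C) and paste patch (h, w, C) along the trajectory."""
    out = clip.copy()
    h, w = patch.shape[:2]
    for frame, (y, x) in zip(out, trajectory):
        frame[y:y + h, x:x + w] = patch  # frame is a view into out
    return out

def make_positive_pair(clip_a, clip_b, patch_size=16, rng=None):
    """Hypothetical helper: same tubelet motion on two different clips.

    Assumes clip_a and clip_b are uint8 arrays of identical shape (T, H, W, C).
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames, height, width, channels = clip_a.shape
    # Assumption: the moving patch is random noise rather than a real image crop.
    patch = rng.integers(0, 256, size=(patch_size, patch_size, channels),
                         dtype=np.uint8)
    # Sample start and end positions so the patch always stays inside the frame.
    upper = [height - patch_size + 1, width - patch_size + 1]
    start = rng.integers(0, upper)
    end = rng.integers(0, upper)
    trajectory = linear_trajectory(num_frames, start, end)
    view_a = paste_tubelet(clip_a, patch, trajectory)
    view_b = paste_tubelet(clip_b, patch, trajectory)
    return view_a, view_b, trajectory

# Usage: two clips with different appearance receive identical tubelet motion.
clip_a = np.zeros((8, 64, 64, 3), dtype=np.uint8)
clip_b = np.full((8, 64, 64, 3), 200, dtype=np.uint8)
view_a, view_b, trajectory = make_positive_pair(
    clip_a, clip_b, rng=np.random.default_rng(0))
```

In a contrastive setup, `view_a` and `view_b` would be treated as a positive pair, so the encoder can only match them through the shared motion and not through the background appearance.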

research
12/07/2021

Time-Equivariant Contrastive Video Representation Learning

We introduce a novel self-supervised contrastive learning method to lear...
research
04/10/2022

Self-Supervised Video Representation Learning with Motion-Contrastive Perception

Visual-only self-supervised learning has achieved significant improvemen...
research
08/23/2023

MOFO: MOtion FOcused Self-Supervision for Video Understanding

Self-supervised learning (SSL) techniques have recently produced outstan...
research
06/25/2022

SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos

Self-supervised methods have significantly closed the gap with end-to-en...
research
06/08/2023

Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment

The egocentric and exocentric viewpoints of a human activity look dramat...
research
07/21/2020

Video Representation Learning by Recognizing Temporal Transformations

We introduce a novel self-supervised learning approach to learn represen...
research
02/29/2020

First Order Motion Model for Image Animation

Image animation consists of generating a video sequence so that an objec...
