ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints

12/07/2021
by Srijan Das, et al.

Self-supervised video representation learning predominantly focuses on discriminating between instances generated by simple data augmentation schemes. However, the learned representations often fail to generalize to unseen camera viewpoints. To address this, we propose ViewCLR, which learns a self-supervised video representation that is invariant to camera viewpoint changes. We introduce a view-generator, which can be regarded as a learnable augmentation for any self-supervised pretext task, to generate a latent viewpoint representation of a video. ViewCLR maximizes the similarity between this latent viewpoint representation and the representation from the original viewpoint, enabling the learned video encoder to generalize to unseen camera viewpoints. Experiments on cross-view benchmark datasets, including NTU RGB+D, show that ViewCLR is a state-of-the-art viewpoint-invariant self-supervised method.
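
The idea of maximizing similarity between a latent viewpoint representation and the original-viewpoint representation can be illustrated with a small contrastive sketch. The abstract does not specify the loss or the form of the view-generator, so everything here is an assumption made for illustration: an InfoNCE-style loss stands in for the similarity objective, random vectors stand in for video encoder outputs, and a near-identity linear map `W` plays the role of a hypothetical view-generator acting in latent space.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce(anchor, positive, temperature=0.1):
    """InfoNCE-style loss (an assumed objective, not taken from the paper).

    Row i of `anchor` is pulled toward row i of `positive` and pushed
    away from every other row of `positive` in the batch.
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    logits = a @ p.T / temperature                       # (N, N) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                   # matched pairs sit on the diagonal


rng = np.random.default_rng(0)
batch, dim = 8, 16

# Stand-in for encoder outputs of videos seen from the original viewpoint.
original = rng.normal(size=(batch, dim))

# Hypothetical view-generator: a near-identity linear map in latent space.
W = np.eye(dim) + 0.05 * rng.normal(size=(dim, dim))
latent_view = original @ W

# Correctly paired representations versus deliberately mismatched pairs.
loss_matched = info_nce(original, latent_view)
loss_shuffled = info_nce(original, latent_view[::-1])
```

In this toy setting the loss is lower when each video is paired with its own latent viewpoint representation than when the pairs are mismatched. In an actual training loop both the encoder and the view-generator would be learned, which this sketch does not attempt.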

