Evolving Losses for Unsupervised Video Representation Learning

02/26/2020
by AJ Piergiovanni, et al.

We present a new method for learning video representations from large-scale unlabeled video data. Ideally, this representation will be generic and transferable, directly usable for new tasks such as action recognition and zero-shot or few-shot learning. First, we formulate unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are shared across different modalities via distillation. Second, we introduce the concept of loss function evolution, using an evolutionary search algorithm to automatically find an optimal combination of loss functions capturing many (self-supervised) tasks and modalities. Third, we propose an unsupervised representation evaluation metric that uses distribution matching to a large unlabeled dataset as a prior constraint, based on Zipf's law. This unsupervised constraint, which is not guided by any labeling, produces results similar to weakly-supervised, task-specific ones. The proposed unsupervised representation learning results in a single RGB network and outperforms previous methods. Notably, it is also more effective than several label-based methods (e.g., ImageNet), with the exception of large, fully labeled video datasets.
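The loss-evolution idea in the abstract can be illustrated with a toy sketch. The abstract does not give the details, so everything here is an assumption: that candidate loss combinations are weight vectors in [0, 1], that a representation is scored by clustering it and comparing ranked cluster sizes to a Zipf prior with KL divergence, and that the search is a simple tournament-style evolution. The names `zipf_fitness`, `evolve_loss_weights`, and the `evaluate` callback (which stands in for "train with these weights, then cluster the features") are hypothetical.

```python
import math
import random


def zipf_prior(k, s=1.0):
    """Zipf distribution over ranks 1..k: p(i) is proportional to 1 / i**s."""
    weights = [1.0 / (i ** s) for i in range(1, k + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def zipf_fitness(cluster_counts, s=1.0):
    """Lower is better: KL between ranked cluster sizes and a Zipf prior.

    Assumption: this KL-based score is an illustrative stand-in for the
    "distribution matching" constraint mentioned in the abstract.
    """
    ranked = sorted(cluster_counts, reverse=True)
    total = sum(ranked)
    empirical = [c / total for c in ranked]
    return kl_divergence(empirical, zipf_prior(len(ranked), s))


def evolve_loss_weights(evaluate, num_losses, generations=20,
                        pop_size=8, sigma=0.1, seed=0):
    """Toy evolutionary search over loss weights (requires pop_size >= 3).

    `evaluate(weights)` is a hypothetical callback returning cluster counts
    for a representation trained with the given loss weights.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(num_losses)]
                  for _ in range(pop_size)]
    scored = [(zipf_fitness(evaluate(w)), w) for w in population]
    for _ in range(generations):
        # Tournament selection: best of three random candidates is the parent.
        parent = min(rng.sample(scored, 3), key=lambda t: t[0])[1]
        # Mutation: Gaussian noise, clipped back into [0, 1].
        child = [min(1.0, max(0.0, x + rng.gauss(0.0, sigma))) for x in parent]
        scored.append((zipf_fitness(evaluate(child)), child))
        # Keep the population size fixed by dropping the worst candidate.
        scored.remove(max(scored, key=lambda t: t[0]))
    return min(scored, key=lambda t: t[0])
```

In practice `evaluate` would be expensive, since each call implies training a network. A toy callback such as `lambda w: [1 + int(100 * x) for x in w]` is enough to exercise the search loop.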

research
06/07/2019

Evolving Losses for Unlabeled Video Representation Learning

We present a new method to learn video representations from unlabeled da...
research
03/15/2023

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

Large scale Vision-Language (VL) models have shown tremendous success in...
research
08/29/2019

Metric-based Regularization and Temporal Ensemble for Multi-task Learning using Heterogeneous Unsupervised Tasks

One of the ways to improve the performance of a target task is to learn ...
research
08/03/2020

Memory-augmented Dense Predictive Coding for Video Representation Learning

The objective of this paper is self-supervised learning from video, in p...
research
03/10/2020

Learning Video Object Segmentation from Unlabeled Videos

We propose a new method for video object segmentation (VOS) that address...
research
02/11/2022

Investigating Power laws in Deep Representation Learning

Representation learning that leverages large-scale labelled datasets, is...
research
10/06/2020

Representation learning from videos in-the-wild: An object-centric approach

We propose a method to learn image representations from uncurated videos...
