Multi-task Self-Supervised Visual Learning

08/25/2017
by   Carl Doersch, et al.
0

We investigate methods for combining multiple self-supervised tasks--i.e., supervised tasks where data can be collected without manual labeling--in order to train a single visual representation. First, we provide an apples-to-apples comparison of four different self-supervised tasks using the very deep ResNet-101 architecture. We then combine tasks to jointly train a network. We also explore lasso regularization to encourage the network to factorize the information in its representation, and methods for "harmonizing" network inputs in order to learn a more unified representation. We evaluate all methods on ImageNet classification, PASCAL VOC detection, and NYU depth prediction. Our results show that deeper networks work better, and that combining tasks--even via a naive multi-head architecture--always improves performance. Our best joint network nearly matches the PASCAL performance of a model pre-trained on ImageNet classification, and matches the ImageNet network on NYU depth prediction.

READ FULL TEXT
research
08/15/2019

Multi-Task Self-Supervised Learning for Disfluency Detection

Most existing approaches to disfluency detection heavily rely on human-a...
research
11/26/2020

How Well Do Self-Supervised Models Transfer?

Self-supervised visual representation learning has seen huge progress in...
research
02/06/2018

Learning Image Representations by Completing Damaged Jigsaw Puzzles

In this paper, we explore methods of complicating self-supervised tasks ...
research
11/17/2017

Improvements to context based self-supervised learning

We develop a set of methods to improve on the results of self-supervised...
research
05/13/2021

Using Self-Supervised Co-Training to Improve Facial Representation

In this paper, at first, the impact of ImageNet pre-training on Facial E...
research
01/02/2023

ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

Driven by improved architectures and better representation learning fram...
research
06/03/2022

Learning an Adaptation Function to Assess Image Visual Similarities

Human perception is routinely assessing the similarity between images, b...

Please sign up or login with your details

Forgot password? Click here to reset