Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles

03/30/2016
by Mehdi Noroozi, et al.

In this paper, we study the problem of learning image representations without human annotation. Following the principles of self-supervision, we build a convolutional neural network (CNN) that is trained to solve Jigsaw puzzles as a pretext task, which requires no manual labeling, and is later repurposed for object classification and detection. To maintain compatibility across tasks, we introduce the context-free network (CFN), a siamese-ennead CNN. The CFN takes image tiles as input and explicitly limits the receptive field (or context) of its early processing units to one tile at a time. We show that the CFN has fewer parameters than AlexNet while preserving the same semantic learning capabilities. By training the CFN to solve Jigsaw puzzles, we learn both a feature mapping of object parts and their correct spatial arrangement. Our experimental evaluations show that the learned features capture semantically relevant content. Our proposed method for learning visual representations outperforms state-of-the-art methods on several transfer learning benchmarks.
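
To make the pretext task concrete, the following is a minimal sketch of how a single Jigsaw training example could be built from an unlabeled image. It is not the authors' implementation. The 3x3 grid is inferred from the term "ennead" (nine tiles); the size of the permutation set, the random sampling of permutations, and the function names `make_permutation_set` and `make_jigsaw_example` are assumptions made for illustration only.

```python
import numpy as np


def make_permutation_set(num_tiles=9, set_size=10, seed=0):
    """Sample a fixed set of distinct tile orderings (hypothetical helper).

    The set size and the uniform random sampling are assumptions; the
    abstract does not say how permutations are chosen.
    """
    rng = np.random.default_rng(seed)
    perms = set()
    while len(perms) < set_size:
        perms.add(tuple(rng.permutation(num_tiles).tolist()))
    return sorted(perms)


def make_jigsaw_example(image, permutations, rng, grid=3):
    """Cut an image into grid x grid tiles and shuffle them.

    Returns the shuffled tiles and the index of the permutation used,
    which serves as the label for the pretext task.
    """
    height, width = image.shape[:2]
    tile_h, tile_w = height // grid, width // grid
    # Tiles in row-major order: tile k sits at row k // grid, column k % grid.
    tiles = [
        image[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w]
        for r in range(grid)
        for c in range(grid)
    ]
    label = int(rng.integers(len(permutations)))
    perm = permutations[label]
    # Position j of the puzzle receives original tile perm[j].
    shuffled = np.stack([tiles[i] for i in perm])
    return shuffled, label


if __name__ == "__main__":
    image = np.zeros((225, 225, 3), dtype=np.uint8)  # stand-in for a real image
    perms = make_permutation_set()
    tiles, label = make_jigsaw_example(image, perms, np.random.default_rng(1))
    print(tiles.shape, label)
```

In a setup like this, each of the nine tiles would be passed separately through a shared-weight branch, so that early layers see only one tile at a time, and the branch outputs would be combined to predict `label`. No human annotation is needed because the label comes from the shuffling itself.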

research
05/04/2015

Unsupervised Learning of Visual Representations using Videos

Is strong supervision necessary for learning a good visual representatio...
research
12/02/2018

Iterative Reorganization with Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning

Learning visual features from unlabeled image data is an important yet c...
research
12/27/2017

Learning More Universal Representations for Transfer-Learning

Transfer learning is commonly used to address the problem of the prohibi...
research
07/29/2019

Modulation of early visual processing alleviates capacity limits in solving multiple tasks

In daily life situations, we have to perform multiple tasks given a visu...
research
08/22/2017

Representation Learning by Learning to Count

We introduce a novel method for representation learning that uses an art...
research
12/01/2016

Object-Centric Representation Learning from Unlabeled Videos

Supervised (pre-)training currently yields state-of-the-art performance ...
research
12/21/2020

Image Annotation based on Deep Hierarchical Context Networks

Context modeling is one of the most fertile subfields of visual recognit...
