Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

01/19/2023
by Mahmoud Assran, et al.

This paper demonstrates an approach for learning highly semantic image representations without relying on hand-crafted data-augmentations. We introduce the Image-based Joint-Embedding Predictive Architecture (I-JEPA), a non-generative approach for self-supervised learning from images. The idea behind I-JEPA is simple: from a single context block, predict the representations of various target blocks in the same image. A core design choice to guide I-JEPA towards producing semantic representations is the masking strategy; specifically, it is crucial to (a) predict several target blocks in the image, (b) sample target blocks with sufficiently large scale (occupying 15%-20% of the image), and (c) use a sufficiently informative (spatially distributed) context block. Empirically, when combined with Vision Transformers, we find I-JEPA to be highly scalable. For instance, we train a ViT-Huge/16 on ImageNet using 32 A100 GPUs in under 38 hours to achieve strong downstream performance across a wide range of tasks requiring various levels of abstraction, from linear classification to object counting and depth prediction.
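The masking strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a 14x14 patch grid (a 224-pixel image with 16-pixel patches), four target blocks each covering roughly 15% to 20% of the image, and a large context block from which any patches overlapping the targets are removed. The aspect-ratio range, the context scale range, and the names `sample_block` and `sample_masks` are hypothetical choices made for illustration.

```python
import math
import random


def sample_block(grid, scale_range, aspect_range, rng):
    """Sample one rectangular block of patch coordinates on a grid x grid layout."""
    scale = rng.uniform(*scale_range)    # fraction of the image area to cover
    aspect = rng.uniform(*aspect_range)  # height / width of the block
    area = scale * grid * grid           # target area, measured in patches
    # Round to whole patches and clamp so the block fits inside the grid.
    h = max(1, min(grid, round(math.sqrt(area * aspect))))
    w = max(1, min(grid, round(math.sqrt(area / aspect))))
    top = rng.randint(0, grid - h)
    left = rng.randint(0, grid - w)
    return {(r, c) for r in range(top, top + h) for c in range(left, left + w)}


def sample_masks(grid=14, num_targets=4, seed=0):
    """Return (context, targets) as sets of (row, col) patch coordinates."""
    rng = random.Random(seed)
    # (a) several target blocks, (b) each at a sufficiently large scale.
    # The aspect-ratio range is an assumed value, not taken from the abstract.
    targets = [
        sample_block(grid, (0.15, 0.20), (0.75, 1.5), rng)
        for _ in range(num_targets)
    ]
    # (c) a large, spatially distributed context block (scale range assumed).
    context = sample_block(grid, (0.85, 1.0), (1.0, 1.0), rng)
    # Drop target patches from the context so the prediction task is non-trivial.
    for target in targets:
        context -= target
    return context, targets


if __name__ == "__main__":
    context, targets = sample_masks()
    print("context patches:", len(context))
    print("target sizes:", [len(t) for t in targets])
```

In a full pipeline, the context patches would be fed to the context encoder and a predictor would be asked to output the representations of each target block; the sketch covers only the sampling of patch indices.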

research
04/14/2022

Masked Siamese Networks for Label-Efficient Learning

We propose Masked Siamese Networks (MSN), a self-supervised learning fra...
research
07/24/2023

MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features

Self-supervised learning of visual representations has been focusing on ...
research
06/09/2023

FLSL: Feature-level Self-supervised Learning

Current self-supervised learning (SSL) methods (e.g., SimCLR, DINO, VICR...
research
04/22/2023

Self-supervised Learning by View Synthesis

We present view-synthesis autoencoders (VSA) in this paper, which is a s...
research
06/22/2020

Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights

In the absence of large labelled datasets, self-supervised learning tech...
research
11/12/2020

Discriminative, Generative and Self-Supervised Approaches for Target-Agnostic Learning

Supervised learning, characterized by both discriminative and generative...
