Masked Siamese Networks for Label-Efficient Learning

by Mahmoud Assran, et al.

We propose Masked Siamese Networks (MSN), a self-supervised learning framework for learning image representations. Our approach matches the representation of an image view containing randomly masked patches to the representation of the original unmasked image. This self-supervised pre-training strategy is particularly scalable when applied to Vision Transformers since only the unmasked patches are processed by the network. As a result, MSNs improve the scalability of joint-embedding architectures, while producing representations of a high semantic level that perform competitively on low-shot image classification. For instance, on ImageNet-1K, with only 5,000 annotated images, our base MSN model achieves 72.4% top-1 accuracy, and with 1% of ImageNet-1K labels, we achieve 75.7% top-1 accuracy, setting a new state-of-the-art for self-supervised learning on this benchmark. Our code is publicly available.
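The scalability claim above rests on the masked view containing fewer patch tokens than the unmasked view. The following is a minimal sketch of that idea only, not the authors' implementation: the helper names `patchify_indices` and `random_mask`, the 224-pixel image, the 16-pixel patches, and the 0.7 mask ratio are all illustrative assumptions, and no encoder or matching loss is modeled.

```python
import random


def patchify_indices(image_size, patch_size):
    """Return the (row, col) grid position of every patch in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    n = image_size // patch_size
    return [(r, c) for r in range(n) for c in range(n)]


def random_mask(patches, mask_ratio, rng):
    """Drop a random fraction of patches and return only the ones that are kept."""
    num_keep = max(1, round(len(patches) * (1.0 - mask_ratio)))
    keep = sorted(rng.sample(range(len(patches)), num_keep))
    return [patches[i] for i in keep]


# Assumed, illustrative settings (not taken from the abstract).
rng = random.Random(0)
patches = patchify_indices(image_size=224, patch_size=16)  # 14 x 14 grid

# Masked view: only the kept patches would be fed to the network.
anchor_tokens = random_mask(patches, mask_ratio=0.7, rng=rng)

# Unmasked view: every patch is kept.
target_tokens = patches

print(len(target_tokens), len(anchor_tokens))
```

Under these assumed settings the masked view carries well under half as many tokens as the unmasked view, which is where the compute saving for a Vision Transformer would come from, since the masked patches are dropped rather than replaced by placeholder tokens.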




Self-supervised Learning for Sonar Image Classification

Self-supervised learning has proved to be a powerful approach to learn i...

Intra-Instance VICReg: Bag of Self-Supervised Image Patch Embedding

Recently, self-supervised learning (SSL) has achieved tremendous empiric...

Siamese Image Modeling for Self-Supervised Vision Representation Learning

Self-supervised learning (SSL) has delivered superior performance on a v...

Siamese Encoding and Alignment by Multiscale Learning with Self-Supervision

We propose a method of aligning a source image to a target image, where ...

Self Supervised Learning for Few Shot Hyperspectral Image Classification

Deep learning has proven to be a very effective approach for Hyperspectr...

Self-supervised pre-training enhances change detection in Sentinel-2 imagery

While annotated images for change detection using satellite imagery are ...

Efficient Self-supervised Vision Transformers for Representation Learning

This paper investigates two techniques for developing efficient self-sup...

Code Repositories


Masked Siamese Networks for Label-Efficient Learning
