MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments

07/18/2023
by Spyros Gidaris, et al.

Self-supervised learning can mitigate the need of Vision Transformer networks for very large fully annotated datasets. Different classes of self-supervised learning methods offer representations with either good contextual reasoning properties (e.g., masked image modeling strategies) or invariance to image perturbations (e.g., contrastive methods). In this work, we propose MOCA, a single-stage, standalone method that unifies both desired properties using novel mask-and-predict objectives defined over high-level features rather than pixel-level details. Moreover, we show how to employ both learning paradigms in a synergistic and computation-efficient way. Doing so, we achieve new state-of-the-art results in low-shot settings and strong results across various evaluation protocols, with training that is at least three times faster than prior methods.
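The core idea, predicting the codebook assignments of masked patches from high-level teacher features, can be illustrated with a minimal sketch. All sizes, the fixed codebook, and the random logits standing in for the student network are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): 16 patch tokens,
# 64-dim features, a codebook of 8 entries.
num_patches, dim, codebook_size = 16, 64, 8

codebook = rng.normal(size=(codebook_size, dim))     # online codebook (fixed here for the sketch)
teacher_feats = rng.normal(size=(num_patches, dim))  # teacher's high-level patch features

# Step 1: the teacher assigns each patch to its nearest codebook entry;
# these assignments serve as the prediction targets.
dists = ((teacher_feats[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
targets = dists.argmin(axis=1)                       # shape: (num_patches,)

# Step 2: mask a subset of patches; the student only sees the rest.
mask = rng.random(num_patches) < 0.6                 # True = masked position

# Step 3: the student outputs logits over the codebook for every patch
# (random logits stand in for the actual network here).
student_logits = rng.normal(size=(num_patches, codebook_size))

# Step 4: cross-entropy between student predictions and teacher
# assignments, computed only at the masked positions.
log_probs = student_logits - np.log(np.exp(student_logits).sum(-1, keepdims=True))
loss = -log_probs[mask, targets[mask]].mean()
print(float(loss))
```

In a real training loop the codebook would be updated online, the teacher would typically be a momentum-averaged copy of the student, and the loss would be backpropagated through the student network.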

