ViewMix: Augmentation for Robust Representation in Self-Supervised Learning

by Arjon Das, et al.

Self-supervised learning methods based on joint embedding architectures attribute much of their strong representation learning capability to the composition of data augmentations. Although regional dropout strategies have been shown to guide models to focus on less indicative parts of objects in supervised methods, they have not been adopted by self-supervised methods for generating positive pairs, because they do not suit the input sampling process of the self-supervised methodology. Dropping informative pixels from the positive pairs can result in inefficient training, while replacing patches of a specific object with patches of a different one can steer the model away from maximizing the agreement between positive pairs. Moreover, joint embedding representation learning methods have not made robustness their primary training outcome. To this end, we propose the ViewMix augmentation policy, designed specifically for self-supervised learning: after different views of the same image are generated, patches are cut from one view and pasted into another. By leveraging the different views created by this augmentation strategy, multiple joint embedding-based self-supervised methodologies obtained better localization capability and consistently outperformed their corresponding baseline methods. We also demonstrate that incorporating the ViewMix augmentation policy promotes robustness of the representations in state-of-the-art methods. Furthermore, our experimentation and analysis of compute times suggest that ViewMix augmentation does not introduce any additional overhead compared to its counterparts.
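The cut-and-paste step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_box` and `viewmix`, the `area_ratio` parameter, and the CutMix-style box sampling are all assumptions made for illustration, and the two views are assumed to be already-augmented NumPy arrays of identical shape taken from the same source image.

```python
import numpy as np


def sample_box(height, width, area_ratio, rng):
    """Sample a rectangle covering at most `area_ratio` of the image.

    Hypothetical helper: the box is centred at a uniformly random pixel
    and clipped to the image bounds, in the style of CutMix.
    """
    cut_h = int(height * np.sqrt(area_ratio))
    cut_w = int(width * np.sqrt(area_ratio))
    cy = int(rng.integers(0, height))
    cx = int(rng.integers(0, width))
    top = max(cy - cut_h // 2, 0)
    bottom = min(cy + cut_h // 2, height)
    left = max(cx - cut_w // 2, 0)
    right = min(cx + cut_w // 2, width)
    return top, bottom, left, right


def viewmix(view_a, view_b, area_ratio=0.25, rng=None):
    """Paste a patch of `view_b` into a copy of `view_a`.

    Both views are assumed to come from the same image, so the pasted
    patch still shows the same object (an assumption of this sketch).
    Returns the mixed view and the box that was pasted.
    """
    if view_a.shape != view_b.shape:
        raise ValueError("views must have the same shape")
    rng = np.random.default_rng() if rng is None else rng
    height, width = view_a.shape[:2]
    top, bottom, left, right = sample_box(height, width, area_ratio, rng)
    mixed = view_a.copy()  # leave the input view untouched
    mixed[top:bottom, left:right] = view_b[top:bottom, left:right]
    return mixed, (top, bottom, left, right)


# Toy usage: two constant "views" make the pasted region easy to see.
view_a = np.zeros((32, 32, 3), dtype=np.float32)
view_b = np.ones((32, 32, 3), dtype=np.float32)
mixed, box = viewmix(view_a, view_b, rng=np.random.default_rng(0))
```

In a joint embedding pipeline, the mixed view would presumably serve as one member of a positive pair alongside another view of the same image; how the pair is formed and which loss is applied are not specified in the abstract and are left out of this sketch.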




Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization

Training speaker-discriminative and robust speaker verification systems ...

Leveraging background augmentations to encourage semantic focus in self-supervised contrastive learning

Unsupervised representation learning is an important challenge in comput...

Directional Self-supervised Learning for Risky Image Augmentations

Only a few cherry-picked robust augmentation policies are beneficial to ...

Accelerating Self-Supervised Learning via Efficient Training Strategies

Recently the focus of the computer vision community has shifted from exp...

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

Regional dropout strategies have been proposed to enhance the performanc...

Self-supervised Learning of Image Embedding for Continuous Control

Operating directly from raw high dimensional sensory inputs like images ...

ViCE: Self-Supervised Visual Concept Embeddings as Contextual and Pixel Appearance Invariant Semantic Representations

This work presents a self-supervised method to learn dense semantically ...
