Hybrid BYOL-ViT: Efficient approach to deal with small datasets

11/08/2021
by   Safwen Naimi, et al.

Supervised learning can learn large representational spaces, which are crucial for handling difficult learning tasks. However, due to the design of the model, classical image classification approaches struggle to generalize to new problems and new situations when dealing with small datasets. In fact, supervised learning can lose the location of image features, which leads to supervision collapse in very deep architectures. In this paper, we investigate how self-supervision with strong and sufficient augmentation of unlabeled data can effectively train the first layers of a neural network, even better than supervised learning, with no need for millions of labeled data. The main goal is to disconnect pixel data from annotation by learning generic, task-agnostic low-level features. Furthermore, we look into Vision Transformers (ViT) and show that the low-level features derived from a self-supervised architecture can improve the robustness and the overall performance of this emergent architecture. We evaluated our method on STL-10, one of the smallest open-source datasets, and obtained a significant performance boost, from 41.66% to 83.25%, when feeding the low-level features learned by the self-supervised architecture to the ViT instead of the raw images.
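To make the hybrid pipeline concrete, here is a minimal sketch in PyTorch, under stated assumptions: the HybridBYOLViT class, layer choices, and dimensions are illustrative (a ResNet-18 stem stands in for a BYOL-pretrained backbone, whose first layers are frozen and whose feature maps are tokenized for a small transformer encoder). The paper's exact BYOL training recipe and ViT configuration may differ.

```python
# Hypothetical sketch: frozen low-level features from a self-supervised
# backbone are fed to a transformer instead of the raw RGB image.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class HybridBYOLViT(nn.Module):
    def __init__(self, num_classes=10, embed_dim=192, depth=6, heads=3):
        super().__init__()
        backbone = resnet18()  # assumption: BYOL-pretrained weights loaded here
        # Keep only the earliest layers: generic, task-agnostic low-level features.
        self.low_level = nn.Sequential(
            backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
            backbone.layer1,  # output for 96x96 input: (B, 64, 24, 24)
        )
        for p in self.low_level.parameters():  # frozen self-supervised stem
            p.requires_grad = False
        # "Patchify" the feature map with a 1x1 projection: each spatial
        # position becomes one token for the transformer.
        self.proj = nn.Conv2d(64, embed_dim, kernel_size=1)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # Positional embedding sized for STL-10 (96x96 images -> 24x24 map).
        self.pos_embed = nn.Parameter(torch.zeros(1, 1 + 24 * 24, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=heads, dim_feedforward=4 * embed_dim,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):  # x: (B, 3, 96, 96) STL-10 images
        f = self.low_level(x)  # (B, 64, 24, 24)
        tokens = self.proj(f).flatten(2).transpose(1, 2)  # (B, 576, embed_dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        z = torch.cat([cls, tokens], dim=1) + self.pos_embed
        z = self.encoder(z)
        return self.head(z[:, 0])  # classify from the CLS token


model = HybridBYOLViT()
logits = model(torch.randn(2, 3, 96, 96))  # -> (2, 10)
```

The key design choice the sketch illustrates is that only the frozen low-level stem comes from self-supervision; the transformer on top is trained normally on the small labeled set.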
