Offline Clustering Approach to Self-supervised Learning for Class-imbalanced Image Data

by Hye-min Chang, et al.

Class-imbalanced datasets are known to bias models toward the majority classes. In this project, we pose two research questions: 1) when is the class-imbalance problem more pronounced in self-supervised pre-training? and 2) can offline clustering of feature representations help pre-training on class-imbalanced data? To investigate the first question, we vary the degree of class imbalance when training the baseline models, SimCLR and SimSiam, on the CIFAR-10 dataset. To answer the second, we train one expert model on each subset of the feature clusters and then distill the knowledge of the expert models into a single model, which we compare against our baselines.
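The offline clustering step above can be sketched as follows. This is a minimal illustration assuming plain k-means over pre-extracted feature vectors; the abstract does not specify which clustering algorithm is used, and the `kmeans` and `cluster_subsets` helpers below are hypothetical names, not the authors' code.

```python
import numpy as np

def kmeans(features, k=3, iters=20, seed=0):
    # Plain k-means as a stand-in for the offline clustering step
    # (the actual clustering algorithm is not stated in the abstract).
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest center.
        dists = np.linalg.norm(features[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return labels

def cluster_subsets(features, k=3):
    # Partition sample indices by cluster; each subset would then be used
    # to train one expert model before distillation into a single model.
    labels = kmeans(features, k)
    return [np.where(labels == c)[0] for c in range(k)]

# Toy example: 300 feature vectors drawn around three separated means,
# standing in for representations from a pre-trained SimCLR/SimSiam encoder.
rng = np.random.default_rng(1)
feats = np.concatenate([rng.normal(m, 0.1, size=(100, 8)) for m in (0.0, 1.0, 2.0)])
subsets = cluster_subsets(feats, k=3)
print([len(s) for s in subsets])
```

In this setup each expert sees only one cluster's samples, so majority classes in the full dataset no longer dominate every training subset; the distillation step then merges the experts back into one model.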


