Cold PAWS: Unsupervised class discovery and the cold-start problem

05/17/2023
by Evelyn J. Mannix, et al.

In many machine learning applications, labeling datasets is an arduous and time-consuming task. Although research has shown that semi-supervised learning techniques can achieve high accuracy with very few labels in computer vision, little attention has been given to how the images to be labeled should be selected from a dataset. In this paper, we propose a novel approach, based on well-established self-supervised learning, clustering, and manifold learning techniques, that addresses the challenge of selecting an informative image subset to label in the first instance. This is known as the cold-start or unsupervised selective labeling problem. We test our approach on several publicly available datasets, namely CIFAR10, Imagenette, DeepWeeds, and EuroSAT, and observe improved performance with both supervised and semi-supervised learning strategies when our label selection strategy is used instead of random sampling. For the datasets considered, our approach also outperforms other methods in the literature while being much simpler.
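
The general idea of clustering-based label selection can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes image embeddings have already been produced by some self-supervised encoder, uses plain k-means as a stand-in for the clustering step, and omits the manifold learning step entirely. The function names `kmeans` and `select_images_to_label` and the synthetic 2-D embeddings are hypothetical and chosen only for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a list of k centroids."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def select_images_to_label(embeddings, n_labels, seed=0):
    """Pick one image index per cluster: the point closest to each centroid.

    Assumption: one cluster per label in the budget, so n_labels images
    are returned. The paper's actual selection rule may differ.
    """
    centroids = kmeans(embeddings, n_labels, seed=seed)
    chosen = []
    for centroid in centroids:
        candidates = [i for i in range(len(embeddings)) if i not in chosen]
        best = min(candidates, key=lambda i: squared_distance(embeddings[i], centroid))
        chosen.append(best)
    return chosen


# Synthetic stand-in for encoder outputs: three 2-D blobs of 30 points each.
data_rng = random.Random(1)
blob_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
embeddings = [
    [cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5)]
    for cx, cy in blob_centres
    for _ in range(30)
]

chosen = select_images_to_label(embeddings, n_labels=3)
print("Indices to send for labeling:", chosen)
```

Compared with random sampling, picking the point nearest each centroid tends to spread a small label budget across distinct regions of the embedding space, which is the intuition behind selecting an informative subset before any labels exist.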
