More Than Meets The Eye: Semi-supervised Learning Under Non-IID Data

04/20/2021
by Saul Calderon-Ramirez, et al.

A common heuristic in semi-supervised deep learning (SSDL) is to select unlabelled data based on a notion of semantic similarity to the labelled data. For example, labelled images of numbers should be paired with unlabelled images of numbers rather than, say, unlabelled images of cars. We refer to this practice as semantic data set matching. In this work, we demonstrate the limits of semantic data set matching and show that it can sometimes even degrade the performance of a state-of-the-art SSDL algorithm. We present and make available a comprehensive simulation sandbox, called non-IID-SSDL, for stress-testing an SSDL algorithm under different degrees of distribution mismatch between the labelled and unlabelled data sets. In addition, we demonstrate that simple density-based dissimilarity measures, computed in the feature space of a generic classifier, offer a promising and more reliable quantitative matching criterion for selecting unlabelled data before SSDL training.
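To make the idea of a density-based dissimilarity measure in feature space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that feature vectors have already been extracted from both data sets with some generic classifier (random arrays stand in for them here), approximates each feature dimension's density with a histogram, and averages the Jensen-Shannon divergence over dimensions. The function names `js_divergence` and `density_dissimilarity`, the bin count, and the choice of Jensen-Shannon divergence are all assumptions made for illustration.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) of two histograms."""
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Only bins with a > 0 contribute; b = m is positive wherever a is.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def density_dissimilarity(feats_a, feats_b, bins=20):
    """Mean per-dimension JS divergence between two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), assumed to be
    features extracted by the same generic classifier (hypothetical input).
    """
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)
    scores = []
    for d in range(feats_a.shape[1]):
        # Shared bin range so both histograms are directly comparable.
        lo = min(feats_a[:, d].min(), feats_b[:, d].min())
        hi = max(feats_a[:, d].max(), feats_b[:, d].max())
        if hi == lo:
            scores.append(0.0)  # constant dimension carries no information
            continue
        p, _ = np.histogram(feats_a[:, d], bins=bins, range=(lo, hi))
        q, _ = np.histogram(feats_b[:, d], bins=bins, range=(lo, hi))
        scores.append(js_divergence(p.astype(float), q.astype(float)))
    return float(np.mean(scores))


# Synthetic stand-ins for extracted features (illustrative data only).
rng = np.random.default_rng(0)
labelled = rng.normal(0.0, 1.0, size=(500, 8))
candidate_near = rng.normal(0.0, 1.0, size=(500, 8))  # same distribution
candidate_far = rng.normal(3.0, 1.0, size=(500, 8))   # shifted distribution

score_near = density_dissimilarity(labelled, candidate_near)
score_far = density_dissimilarity(labelled, candidate_far)
print(f"near: {score_near:.3f}  far: {score_far:.3f}")
```

Under these assumptions, a candidate unlabelled set with a lower score would be preferred for SSDL training; the score is a quantitative proxy and does not rely on a human judgement of semantic similarity.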


research
06/14/2020

MixMOOD: A systematic approach to class distribution mismatch in semi-supervised learning using deep dataset dissimilarity measures

In this work, we propose MixMOOD - a systematic approach to mitigate eff...
research
08/20/2019

More unlabelled data or label more data? A study on semi-supervised laparoscopic image segmentation

Improving a semi-supervised image segmentation task has the option of ad...
research
08/22/2020

Data Programming using Semi-Supervision and Subset Selection

The paradigm of data programming has shown a lot of promise in us...
research
03/01/2022

Semi-supervised Deep Learning for Image Classification with Distribution Mismatch: A Survey

Deep learning methodologies have been employed in several different fiel...
research
11/06/2022

Learning to Annotate Part Segmentation with Gradient Matching

The success of state-of-the-art deep neural networks heavily relies on t...
research
12/06/2021

Organ localisation using supervised and semi supervised approaches combining reinforcement learning with imitation learning

Computer aided diagnostics often requires analysis of a region of intere...
