Deep Low-Density Separation for Semi-Supervised Classification

05/22/2022
by Michael C. Burkhart, et al.

Given a small set of labeled data and a large set of unlabeled data, semi-supervised learning (SSL) attempts to leverage the location of the unlabeled datapoints to create a better classifier than could be obtained from supervised methods applied to the labeled training set alone. Effective SSL imposes structural assumptions on the data, e.g., that neighbors are more likely to share a classification or that the decision boundary lies in an area of low density. For complex and high-dimensional data, neural networks can learn feature embeddings to which traditional SSL methods can then be applied, in what we call hybrid methods. Previously developed hybrid methods iterate between refining a latent representation and performing graph-based SSL on this representation. In this paper, we introduce a novel hybrid method that instead applies low-density separation to the embedded features. We describe it in detail and discuss why low-density separation may be better suited than graph-based algorithms for SSL on neural network-based embeddings. We validate our method using in-house customer survey data and compare it to other state-of-the-art learning methods. Our approach effectively classifies thousands of unlabeled users from a relatively small number of hand-classified examples.
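
To make the low-density assumption concrete, the following is a minimal sketch and not the authors' implementation. It assumes the embedded features have already been produced by some network (synthetic two-cluster data stands in for them here) and fits a linear boundary with a gradient-based, transductive-SVM-style objective: labeled points pay a squared hinge loss, while unlabeled points pay a smooth penalty exp(-3 f(z)^2) that is largest near the boundary. The function name `fit_low_density_linear`, its parameters, and the data are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_low_density_linear(Z_lab, y_lab, Z_unl, c_unl=0.5, lam=1e-2,
                           lr=0.02, epochs=500):
    """Fit f(z) = w.z + b on embedded features (illustrative sketch only).

    Labeled points (y in {-1, +1}) pay a squared hinge loss. Unlabeled
    points pay exp(-3 f(z)^2), which peaks at f(z) = 0 and therefore
    pushes the decision boundary away from unlabeled data.
    """
    w = np.zeros(Z_lab.shape[1])
    b = 0.0
    n_lab, n_unl = len(Z_lab), len(Z_unl)
    for epoch in range(epochs):
        # Ramp the unlabeled weight up gradually so the labeled points
        # orient the boundary before the unlabeled penalty dominates.
        c_t = c_unl * (epoch + 1) / epochs

        f_lab = Z_lab @ w + b
        margin = np.maximum(0.0, 1.0 - y_lab * f_lab)
        g_lab = -2.0 * y_lab * margin                      # d(squared hinge)/df

        f_unl = Z_unl @ w + b
        g_unl = -6.0 * f_unl * np.exp(-3.0 * f_unl ** 2)   # d(exp(-3 f^2))/df

        grad_w = lam * w + (Z_lab.T @ g_lab) / n_lab + c_t * (Z_unl.T @ g_unl) / n_unl
        grad_b = g_lab.mean() + c_t * g_unl.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


rng = np.random.default_rng(0)
# Synthetic stand-ins for network embeddings: two well-separated clusters.
Z_neg = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(200, 2))
Z_pos = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(200, 2))
Z_lab = np.vstack([Z_neg[:2], Z_pos[:2]])                  # only four labeled points
y_lab = np.array([-1.0, -1.0, 1.0, 1.0])
Z_unl = np.vstack([Z_neg[2:], Z_pos[2:]])
y_unl = np.concatenate([-np.ones(198), np.ones(198)])      # held out, for scoring only

w, b = fit_low_density_linear(Z_lab, y_lab, Z_unl)
acc = np.mean(np.sign(Z_unl @ w + b) == y_unl)
print(f"accuracy on unlabeled points: {acc:.3f}")
```

In a hybrid setting, `Z_lab` and `Z_unl` would be replaced by the outputs of the embedding network. The linear model and fixed step size are simplifications chosen to keep the sketch short.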

research
06/07/2018

Semi-Supervised Learning via Compact Latent Space Clustering

We present a novel cost function for semi-supervised learning of neural ...
research
12/04/2019

Large-Scale Semi-Supervised Learning via Graph Structure Learning over High-Dense Points

We focus on developing a novel scalable graph-based semi-supervised lear...
research
01/16/2019

The information-theoretic value of unlabeled data in semi-supervised learning

We quantify the separation between the numbers of labeled examples requi...
research
02/24/2022

Self-Training: A Survey

In recent years, semi-supervised algorithms have received a lot of inter...
research
05/30/2017

Semi-Supervised Learning for Detecting Human Trafficking

Human trafficking is one of the most atrocious crimes and among the chal...
research
03/18/2021

Data driven algorithms for limited labeled data learning

We consider a novel data driven approach for designing learning algorith...
research
01/15/2020

Two Cycle Learning: Clustering Based Regularisation for Deep Semi-Supervised Classification

This work addresses the challenge of classification with minimal annota...
