Graph Construction for Learning with Unbalanced Data

12/11/2011
by Jing Qian, et al.

Unbalanced data arises in many learning tasks, such as clustering of multi-class data, hierarchical divisive clustering, and semi-supervised learning. Graph-based approaches are popular tools for these problems, and graph construction is an important aspect of graph-based learning. We show that graph-based algorithms can fail on unbalanced data for many popular graphs, such as k-NN, ϵ-neighborhood and full-RBF graphs. We propose a novel graph construction technique that encodes global statistical information into node degrees through a ranking scheme. The rank of a data sample is an estimate of its p-value and is proportional to the total number of data samples with smaller density. This ranking scheme serves as a surrogate for density, can be reliably estimated, and indicates whether a data sample is close to a valley or a mode of the density. The resulting rank-modulated degree (RMD) scheme significantly sparsifies the graph near valleys and provides an adaptive way to cope with unbalanced data. We then theoretically justify our method through limit cut analysis. Unsupervised and semi-supervised experiments on synthetic and real data sets demonstrate the superiority of our method.
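The ranking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how density is estimated or how ranks are turned into degrees, so the sketch assumes (a) the mean distance to the `k_stat` nearest neighbours as a density surrogate, and (b) a hypothetical linear mapping from rank to degree controlled by `k0` and `lam`. All function and parameter names are illustrative.

```python
import math


def knn_mean_distance(points, k):
    """Mean distance from each point to its k nearest neighbours.

    Assumption: a larger value is read as a lower local density.
    """
    stats = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        stats.append(sum(dists[:k]) / k)
    return stats


def density_ranks(stats):
    """rank[i] = fraction of samples whose estimated density is lower than sample i's."""
    n = len(stats)
    return [sum(1 for s in stats if s > si) / n for si in stats]


def rank_modulated_degrees(ranks, k0, lam):
    """Hypothetical mapping: low-rank (valley) points get fewer neighbours.

    Degrees grow linearly with rank, starting at lam * k0 for rank 0.
    """
    return [max(1, round(k0 * (lam + 2 * (1 - lam) * r))) for r in ranks]


def rmd_knn_graph(points, k0=4, lam=0.5, k_stat=2):
    """Build an undirected k-NN graph whose per-node k depends on the node's rank."""
    n = len(points)
    ranks = density_ranks(knn_mean_distance(points, k_stat))
    degrees = rank_modulated_degrees(ranks, k0, lam)
    edges = set()
    for i, p in enumerate(points):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in nearest[: degrees[i]]:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return edges, ranks, degrees


if __name__ == "__main__":
    # Three tightly spaced points and one isolated point.
    pts = [(0.0,), (1.0,), (2.0,), (10.0,)]
    edges, ranks, degrees = rmd_knn_graph(pts, k0=4, lam=0.5, k_stat=1)
    print(ranks, degrees, sorted(edges))
```

In this sketch the isolated point receives the lowest rank and therefore the smallest degree, which is the sparsifying effect near low-density regions that the abstract describes. The brute-force distance computation is quadratic in the number of samples and is only meant for small illustrations.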

research
05/07/2012

Graph-based Learning with Unbalanced Clusters

Graph construction is a crucial step in spectral clustering (SC) and gra...
research
02/20/2013

Spectral Clustering with Unbalanced Data

Spectral clustering (SC) and graph-based semi-supervised learning (SSL) ...
research
08/31/2020

Structured Graph Learning for Clustering and Semi-supervised Classification

Graphs have become increasingly popular in modeling structures and inter...
research
09/09/2013

Spectral Clustering with Imbalanced Data

Spectral clustering is sensitive to how graphs are constructed from data...
research
09/24/2019

Structured Graph Learning Via Laplacian Spectral Constraints

Learning a graph with a specific structure is essential for interpretabi...
research
11/06/2018

How Many Pairwise Preferences Do We Need to Rank A Graph Consistently?

We consider the problem of optimal recovery of true ranking of n items f...
research
07/30/2013

Scalable k-NN graph construction

The k-NN graph has played a central role in increasingly popular data-dr...
