Semi-supervised K-means++

02/01/2016
by Jordan Yoder, et al.

Traditionally, practitioners initialize the k-means algorithm with centers chosen uniformly at random. Randomized initialization with uneven weights (k-means++) has more recently been used to improve on this strategy in both cost and run time. We consider the k-means problem with semi-supervised information, where some of the data are pre-labeled and we seek to label the rest according to the minimum-cost solution. By extending the k-means++ algorithm and its analysis to account for the labels, we derive an improved theoretical bound on expected cost and observe improved performance on simulated and real data examples. This analysis provides theoretical justification for a roughly linear semi-supervised clustering algorithm.
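The abstract does not spell out the seeding procedure, so the following is only a minimal sketch of one plausible way labels could enter a k-means++-style initialization. It assumes that each class with labeled points is seeded at the mean of those points, and that any remaining centers are drawn by the usual k-means++ rule (probability proportional to squared distance from the nearest center chosen so far). The function name `seed_centers`, its parameters, and the use of `None` for unlabeled points are all hypothetical and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def seed_centers(points, labels, k, rng=None):
    """Hypothetical label-aware seeding (an assumption, not the paper's code).

    points: list of coordinate tuples.
    labels: list of the same length; a class id, or None if unlabeled.
    k:      total number of centers wanted.
    """
    rng = rng or random.Random(0)

    # Group the pre-labeled points by class.
    groups = {}
    for p, lab in zip(points, labels):
        if lab is not None:
            groups.setdefault(lab, []).append(p)
    if len(groups) > k:
        raise ValueError("more labeled classes than centers")

    # Assumption: each labeled class contributes its mean as a center.
    centers = [mean(groups[lab]) for lab in sorted(groups)]

    # With no labels at all, fall back to a uniform first pick,
    # as in standard k-means++.
    if not centers:
        centers.append(tuple(rng.choice(points)))

    # Draw the remaining centers with uneven (D^2) weights.
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a center; pick uniformly.
            centers.append(tuple(rng.choice(points)))
        else:
            centers.append(tuple(rng.choices(points, weights=weights, k=1)[0]))
    return centers


points = [(0, 0), (0, 2), (10, 10), (10, 12), (50, 50)]
labels = [0, 0, 1, 1, None]
centers = seed_centers(points, labels, k=3)
```

In this toy call the first two centers are the means of the two labeled classes, and the third is sampled from the data with weights that strongly favor the far-away unlabeled point. The resulting centers would then be handed to ordinary k-means iterations.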


research
03/16/2020

A semi-supervised sparse K-Means algorithm

We consider the problem of data clustering with unidentified feature qua...
research
10/05/2021

Quantum Semi-Supervised Learning with Quantum Supremacy

Quantum machine learning promises to efficiently solve important problem...
research
11/30/2021

An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering

The minimum sum-of-squares clustering (MSSC), or k-means type clustering...
research
06/27/2012

A Simple Algorithm for Semi-supervised Learning with Improved Generalization Error Bound

In this work, we develop a simple algorithm for semi-supervised regressi...
research
04/30/2013

Semi-Supervised Information-Maximization Clustering

Semi-supervised clustering aims to introduce prior knowledge in the deci...
research
08/17/2018

Semi-Supervised Cluster Extraction via a Compressive Sensing Approach

We use techniques from compressive sensing to design a local clustering ...
research
10/31/2022

Improved Learning-augmented Algorithms for k-means and k-medians Clustering

We consider the problem of clustering in the learning-augmented setting,...
