A semi-supervised sparse K-Means algorithm

03/16/2020
by Avgoustinos Vouros, et al.

We consider the problem of data clustering when the quality of the features is unknown but a small amount of labelled data is available. In the first case, a sparse clustering method can be employed to detect the subset of features necessary for clustering; in the second, a semi-supervised method can use the labelled data to create constraints and enhance the clustering solution. In this paper we propose a K-Means inspired algorithm that employs both techniques. We show that the algorithm maintains the high performance of other similar semi-supervised algorithms while retaining the ability to distinguish informative from uninformative features. We examine the performance of the algorithm on real-world data sets with unknown feature quality as well as a real-world data set with a known uninformative feature. We use a series of scenarios with different numbers and types of constraints.
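The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes (a) that labelled points are turned into pairwise must-link and cannot-link constraints, and (b) a common sparse K-Means style feature-weight update, in which non-negative weights with unit L2 norm and an L1 bound `s` are fitted to the per-feature between-cluster sum of squares by soft-thresholding. The function names and the parameter `s` are hypothetical, chosen for illustration only.

```python
import numpy as np
from itertools import combinations


def constraints_from_labels(labels):
    """Turn a few labelled points into pairwise constraints (assumed scheme).

    labels: dict mapping point index -> class label.
    Returns (must_link, cannot_link) as lists of index pairs.
    """
    must_link, cannot_link = [], []
    for (i, yi), (j, yj) in combinations(sorted(labels.items()), 2):
        (must_link if yi == yj else cannot_link).append((i, j))
    return must_link, cannot_link


def sparse_feature_weights(X, assignments, s, n_iter=50):
    """Feature weights w >= 0 with ||w||_2 = 1 and ||w||_1 <= s (s >= 1).

    Assumes at least one feature separates the clusters, so that the
    between-cluster sum of squares is not zero for every feature.
    """
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for k in np.unique(assignments):
        Xk = X[assignments == k]
        within += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    # Per-feature between-cluster sum of squares (clipped at zero).
    a = np.maximum(total - within, 0.0)

    w = a / np.linalg.norm(a)
    if w.sum() <= s:
        return w  # the L1 bound is already satisfied, no thresholding needed

    # Fallback that always satisfies both bounds: all weight on the best feature.
    best = np.zeros_like(a)
    best[np.argmax(a)] = 1.0

    # Bisection on the soft-threshold level delta.
    lo, hi = 0.0, a.max()
    for _ in range(n_iter):
        delta = (lo + hi) / 2.0
        w = np.maximum(a - delta, 0.0)
        w = w / np.linalg.norm(w)
        if w.sum() > s:
            lo = delta
        else:
            hi, best = delta, w
    return best
```

A small `s` pushes the weights of uninformative features to exactly zero, which is how a sparse method of this kind can separate informative from uninformative features. In a full algorithm the weights would alternate with a constraint-aware cluster assignment step, which is omitted here.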
