Discriminative Similarity for Data Clustering

09/17/2021
by Yingzhen Yang, et al.
Similarity-based clustering methods separate data into clusters according to the pairwise similarity between data points, so the choice of similarity is crucial to their performance. In this paper, we propose Clustering by Discriminative Similarity (CDS), a novel method which learns discriminative similarity for data clustering. CDS learns an unsupervised similarity-based classifier from each data partition, and searches for the optimal partition of the data by minimizing the generalization error of the learnt classifiers associated with the data partitions. Through generalization analysis via Rademacher complexity, the generalization error bound for the unsupervised similarity-based classifier is expressed as the sum of discriminative similarity between data from different classes. We prove that the derived discriminative similarity can also be induced by the integrated squared error bound for kernel density classification. To evaluate the performance of the proposed discriminative similarity, we propose a new clustering method that uses a kernel as the similarity function, CDS via unsupervised kernel classification (CDSK), and demonstrate its effectiveness with experimental results.
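
The idea of scoring a partition by the total similarity between points placed in different clusters can be illustrated with a minimal sketch. This is not the CDS or CDSK objective from the paper: a plain Gaussian kernel stands in for the learned discriminative similarity, and the names `gaussian_kernel`, `between_cluster_similarity`, and the `bandwidth` parameter are hypothetical choices made for illustration only.

```python
import numpy as np

def gaussian_kernel(X, bandwidth=1.0):
    # Pairwise squared Euclidean distances, followed by a Gaussian (RBF) kernel.
    # Assumption: a fixed kernel is used here in place of a learned similarity.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def between_cluster_similarity(K, labels):
    # Sum of K[i, j] over unordered pairs (i < j) assigned to different clusters.
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    return float(np.sum(np.triu(K * different, k=1)))

# Two well-separated synthetic blobs (illustrative data, not from the paper).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])
K = gaussian_kernel(X)

good = np.array([0] * 20 + [1] * 20)  # partition that follows the blobs
bad = np.array([0, 1] * 20)           # partition that mixes the blobs

score_good = between_cluster_similarity(K, good)
score_bad = between_cluster_similarity(K, bad)
print(score_good < score_bad)
```

Under these assumptions, the partition that follows the two blobs gets a much smaller between-cluster similarity than the mixed one, which is the kind of quantity a search over partitions would try to minimize.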

research
09/05/2017

Discriminative Similarity for Clustering and Semi-Supervised Learning

Similarity-based clustering and semi-supervised learning methods separat...
research
10/02/2012

Nonparametric Unsupervised Classification

Unsupervised classification methods learn a discriminative classifier fr...
research
05/21/2019

Clustering with Similarity Preserving

Graph-based clustering has shown promising performance in many tasks. A ...
research
05/10/2019

Integrating Tensor Similarity to Enhance Clustering Performance

Clustering aims to separate observed data into different categories. The...
research
11/29/2012

Overlapping clustering based on kernel similarity metric

Producing overlapping schemes is a major issue in clustering. Recent pro...
research
03/26/2018

Similarity based hierarchical clustering of physiological parameters for the identification of health states - a feasibility study

This paper introduces a new unsupervised method for the clustering of ph...
research
08/07/2021

Clustering Large Data Sets with Incremental Estimation of Low-density Separating Hyperplanes

An efficient method for obtaining low-density hyperplane separators in t...
