Revisiting Discriminative Entropy Clustering and its relation to K-means

01/26/2023
by Zhongwen Zhang, et al.

Maximization of mutual information between the model's input and output is formally related to "decisiveness" and "fairness" of the softmax predictions, motivating such unsupervised entropy-based losses for discriminative neural networks. Recent self-labeling methods based on such losses represent the state of the art in deep clustering. However, some important properties of entropy clustering are not well known, or are even misunderstood. For example, we provide a counterexample to prior claims about equivalence to variance clustering (K-means) and point out technical mistakes in such theories. We discuss the fundamental differences between these discriminative and generative clustering approaches. Moreover, we show the susceptibility of standard entropy clustering to narrow margins and motivate an explicit margin maximization term. We also propose an improved self-labeling loss; it is robust to pseudo-labeling errors and enforces stronger fairness. We develop an EM algorithm for our loss that is significantly faster than the standard alternatives. Our results improve the state of the art on standard benchmarks.
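
The link between mutual information, "decisiveness", and "fairness" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used decomposition in which mutual information is estimated as the entropy of the average prediction (fairness) minus the average entropy of the individual predictions (decisiveness). The function names and the toy inputs are hypothetical and are not taken from the paper; this is not the authors' implementation or their proposed loss.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; clipping avoids log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)


def mutual_information(probs):
    """Estimate mutual information from softmax outputs (assumed form).

    probs: array of shape (N, K), one row of class probabilities per input.
    Returns H(average prediction) - average H(prediction).
    """
    fairness = entropy(probs.mean(axis=0))   # high when clusters are balanced
    indecision = entropy(probs).mean()       # low when predictions are confident
    return fairness - indecision


# Hypothetical toy predictions for two inputs and two clusters.
confident_balanced = np.array([[1.0, 0.0], [0.0, 1.0]])
uncertain = np.array([[0.5, 0.5], [0.5, 0.5]])
confident_collapsed = np.array([[1.0, 0.0], [1.0, 0.0]])

for name, probs in [
    ("confident and balanced", confident_balanced),
    ("uncertain", uncertain),
    ("confident but collapsed", confident_collapsed),
]:
    print(f"{name}: {mutual_information(probs):.4f}")
```

In this sketch only the confident and balanced predictions reach the maximum value of log 2 for two clusters; uncertain predictions and predictions collapsed onto a single cluster both score close to zero. An unsupervised loss of this kind would be the negative of the returned value.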

research
03/13/2023

Collision Cross-entropy and EM Algorithm for Self-labeled Classification

We propose "collision cross-entropy" as a robust alternative to the Shan...
research
10/09/2018

Deep clustering: On the link between discriminative models and K-means

In the context of recent deep clustering studies, discriminative models ...
research
11/30/2021

The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration

In spite of the dominant performances of deep neural networks, recent wo...
research
02/27/2020

GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering

Deep clustering has achieved state-of-the-art results via joint represen...
research
10/03/2019

Information based Deep Clustering: An experimental study

Recently, two methods have shown outstanding performance for clustering ...
research
12/03/2011

Information-Maximization Clustering based on Squared-Loss Mutual Information

Information-maximization clustering learns a probabilistic classifier in...
research
06/10/2019

HTDet: A Clustering Method using Information Entropy for Hardware Trojan Detection

Hardware Trojans (HTs) have drawn more and more attention in both academ...
