Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach

05/30/2023
by Tri Nguyen, et al.

The recent integration of deep learning with constrained clustering based on pairwise similarity annotations, known as deep constrained clustering (DCC), has proven effective for incorporating weak supervision into massive data clustering: less than 1% of pairwise similarity annotations can often substantially enhance the clustering accuracy. However, beyond these empirical successes, DCC remains poorly understood in theory. In addition, many DCC paradigms are sensitive to annotation noise, and noisy DCC methods with performance guarantees have been largely elusive. This work first takes a close look at a recently emerged logistic loss function of DCC and characterizes its theoretical properties. Our result shows that the logistic DCC loss ensures the identifiability of data membership under reasonable conditions, which may shed light on its effectiveness in practice. Building upon this understanding, a new loss function based on geometric factor analysis is proposed to guard against noisy annotations. It is shown that even under unknown annotation confusions, the data membership can still be provably identified under the proposed learning criterion. The proposed approach is tested on multiple datasets to validate these claims.
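The abstract does not spell out the logistic DCC loss, so the following is only a minimal sketch of one common form of such a loss, not the paper's exact formulation. It assumes that each sample has a soft membership vector (here obtained with a softmax), that the inner product of two membership vectors models the probability that the pair shares a cluster, and that annotations are binary (1 for "same cluster", 0 otherwise). The function names and the toy scores are hypothetical.

```python
import math

def softmax(scores):
    """Map raw scores to a probability vector (a soft cluster membership)."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pairwise_logistic_loss(memberships, pairs, eps=1e-12):
    """Average logistic (cross-entropy) loss over annotated pairs.

    memberships: list of membership vectors, one per sample
    pairs: list of (i, j, y) with y = 1 for "same cluster", 0 otherwise
    """
    total = 0.0
    for i, j, y in pairs:
        # Assumed model: P(same cluster) = inner product of the two memberships.
        p = sum(a * b for a, b in zip(memberships[i], memberships[j]))
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(pairs)

# Toy data: samples 0 and 1 lean toward cluster 0, sample 2 toward cluster 1.
scores = [[4.0, 0.0], [3.5, 0.2], [0.1, 4.2]]
memberships = [softmax(s) for s in scores]

consistent = [(0, 1, 1), (0, 2, 0)]  # annotations that agree with the memberships
flipped = [(0, 1, 0), (0, 2, 1)]     # the same pairs with both labels flipped

loss_consistent = pairwise_logistic_loss(memberships, consistent)
loss_flipped = pairwise_logistic_loss(memberships, flipped)
```

In this toy setting the loss is much smaller for the consistent annotations than for the flipped ones, which illustrates why a plain loss of this kind can be pulled in the wrong direction by noisy annotations. In a DCC setup the memberships would come from a neural network and the loss would be minimized over its parameters.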


research
03/04/2023

RoLNiP: Robust Learning Using Noisy Pairwise Comparisons

This paper presents a robust approach for learning from noisy pairwise c...
research
02/13/2019

Deep Divergence-Based Approach to Clustering

A promising direction in deep learning research consists in learning rep...
research
08/14/2019

AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations

We propose AutoCorrect, a method to automatically learn object-annotatio...
research
11/28/2017

Learning to cluster in order to Transfer across domains and tasks

This paper introduces a novel method to perform transfer learning across...
research
07/16/2019

The Bregman-Tweedie Classification Model

This work proposes the Bregman-Tweedie classification model and analyzes...
research
04/05/2021

Semi-Supervised Clustering with Inaccurate Pairwise Annotations

Pairwise relational information is a useful way of providing partial sup...
research
06/14/2021

Crowdsourcing via Annotator Co-occurrence Imputation and Provable Symmetric Nonnegative Matrix Factorization

Unsupervised learning of the Dawid-Skene (D&S) model from noisy, incom...
