Bregman Power k-Means for Clustering Exponential Family Data

06/22/2022
by Adithya Vellal, et al.

Recent progress in center-based clustering algorithms combats poor local minima by implicit annealing, using a family of generalized means. These methods are variations of Lloyd's celebrated k-means algorithm, and are most appropriate for spherical clusters such as those arising from Gaussian data. In this paper, we bridge these algorithmic advances to classical work on hard clustering under Bregman divergences, which are in bijection with exponential family distributions and are thus well-suited for clustering objects arising from a breadth of data-generating mechanisms. The elegant properties of Bregman divergences allow us to maintain closed-form updates in a simple and transparent algorithm, and moreover lead to new theoretical arguments for establishing finite-sample bounds that relax the bounded support assumption made in the existing state of the art. Additionally, we conduct thorough empirical analyses on simulated experiments and a case study on rainfall data, finding that the proposed method outperforms existing peer methods in a variety of non-Gaussian data settings.
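To make the ingredients named in the abstract concrete, here is a minimal sketch in Python, not the authors' implementation. It assumes the standard definition of a Bregman divergence, d_phi(x, theta) = phi(x) - phi(theta) - <grad phi(theta), x - theta>, the power mean M_s(y) = (mean of y_j^s)^(1/s) with s < 0, and the known fact that a weighted sum of Bregman divergences in the second argument is minimized by the weighted mean, which is what keeps the center update in closed form. All function names, the annealing schedule (s0, eta) and the iteration count are hypothetical choices for illustration.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, theta):
    """d_phi(x, theta) = phi(x) - phi(theta) - <grad_phi(theta), x - theta>."""
    inner = np.sum(grad_phi(theta) * (x - theta), axis=-1)
    return phi(x) - phi(theta) - inner

# Two standard generators; features live on the last axis.
def sq_euclid_phi(x):
    # Generates squared Euclidean distance (the Gaussian case).
    return np.sum(x ** 2, axis=-1)

def sq_euclid_grad(x):
    return 2.0 * x

def gen_kl_phi(x):
    # Generates the generalized KL divergence (the Poisson case); needs x > 0.
    return np.sum(x * np.log(x) - x, axis=-1)

def gen_kl_grad(x):
    return np.log(x)

def power_mean(y, s, axis=-1):
    """Power mean M_s(y); approaches min(y) as s -> -infinity."""
    return np.mean(y ** s, axis=axis) ** (1.0 / s)

def mm_step(X, centers, s, phi, grad_phi, eps=1e-12):
    """One update: weights are dM_s/dy_j at the current divergences,
    and each new center is the corresponding weighted mean of the data."""
    D = bregman_divergence(phi, grad_phi, X[:, None, :], centers[None, :, :])
    D = np.maximum(D, 0.0) + eps              # guard against round-off and zeros
    k = D.shape[1]
    mean_pow = np.mean(D ** s, axis=1, keepdims=True)
    W = D ** (s - 1.0) * mean_pow ** (1.0 / s - 1.0) / k   # shape (n, k)
    return (W.T @ X) / W.sum(axis=0)[:, None]

def bregman_power_kmeans(X, init_centers, phi, grad_phi,
                         s0=-1.0, eta=1.2, n_iter=20):
    """Anneal s toward -infinity (assumed schedule: s <- eta * s each step)."""
    centers, s = np.asarray(init_centers, dtype=float), s0
    for _ in range(n_iter):
        centers = mm_step(X, centers, s, phi, grad_phi)
        s *= eta
    return centers

# Toy usage: two well-separated groups, squared Euclidean generator.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers = bregman_power_kmeans(X, [[1.0, 1.0], [9.0, 9.0]],
                               sq_euclid_phi, sq_euclid_grad)
print(centers)
```

Swapping in gen_kl_phi and gen_kl_grad changes only the divergence used for the weights; the weighted-mean center update is unchanged. Very negative values of s can overflow or underflow in floating point, so a practical implementation would likely cap s or work in log space.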

research
11/12/2020

Kernel k-Means, By All Means: Algorithms and Strong Consistency

Kernel k-means clustering is a powerful tool for unsupervised learning o...
research
01/10/2020

Entropy Regularized Power k-Means Clustering

Despite its well-known shortcomings, k-means remains one of the most wid...
research
10/27/2021

Uniform Concentration Bounds toward a Unified Framework for Robust Clustering

Recent advances in center-based clustering continue to improve upon the ...
research
01/01/2019

Clustering with Distributed Data

We consider K-means clustering in networked environments (e.g., internet...
research
07/01/2021

q-Paths: Generalizing the Geometric Annealing Path using Power Means

Many common machine learning methods involve the geometric annealing pat...
research
10/26/2017

Energy Clustering

Energy statistics was proposed by Székely in the 80's inspired by the Ne...
research
08/04/2020

Biconvex Clustering

Convex clustering has recently garnered increasing interest due to its a...
