An Efficient Semismooth Newton Based Algorithm for Convex Clustering

02/20/2018
by   Yancheng Yuan, et al.
0

Clustering may be the most fundamental problem in unsupervised learning which is still active in machine learning research because its importance in many applications. Popular methods like K-means, may suffer from instability as they are prone to get stuck in its local minima. Recently, the sum-of-norms (SON) model (also known as clustering path), which is a convex relaxation of hierarchical clustering model, has been proposed in [7] and [5] Although numerical algorithms like ADMM and AMA are proposed to solve convex clustering model [2], it is known to be very challenging to solve large-scale problems. In this paper, we propose a semi-smooth Newton based augmented Lagrangian method for large-scale convex clustering problems. Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. Moreover, the numerical results also show the superior performance and scalability of our algorithm compared to existing first-order methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/04/2018

Convex Clustering: Model, Theoretical Guarantee and Efficient Algorithm

Clustering is a fundamental problem in unsupervised learning. Popular me...
research
04/01/2013

Splitting Methods for Convex Clustering

Clustering is a fundamental problem in many scientific applications. Sta...
research
05/11/2021

A Euclidean Distance Matrix Model for Convex Clustering

Clustering has been one of the most basic and essential problems in unsu...
research
03/20/2020

A Graduated Filter Method for Large Scale Robust Estimation

Due to the highly non-convex nature of large-scale robust parameter esti...
research
12/15/2022

Variable Clustering via Distributionally Robust Nodewise Regression

We study a multi-factor block model for variable clustering and connect ...
research
02/08/2017

Clustering For Point Pattern Data

Clustering is one of the most common unsupervised learning tasks in mach...
research
11/24/2020

A New Algorithm for Convex Biclustering and Its Extension to the Compositional Data

Biclustering is a powerful data mining technique that allows simultaneou...

Please sign up or login with your details

Forgot password? Click here to reset