Convex Covariate Clustering for Classification

03/05/2019
by Daniel Andrade, et al.

Clustering, like covariate selection for classification, is an important step in understanding and interpreting data. However, clustering of covariates is often performed independently of the classification step, which can lead to undesirable clustering results. Therefore, we propose a method that clusters covariates while taking into account the class label information of the samples. We formulate the problem as a convex optimization problem that uses both a priori similarity information between covariates and information from class-labeled samples. Like convex clustering [Chi and Lange, 2015], the proposed method has a unique global minimum, making it insensitive to initialization. To solve the convex problem, we propose a specialized alternating direction method of multipliers (ADMM), which scales up to several thousand variables. Furthermore, to circumvent computationally expensive cross-validation, we propose a model selection criterion based on approximate marginal likelihood estimation. Experiments on synthetic and real data confirm the usefulness of the proposed clustering method and the selection criterion.
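For orientation, the standard convex clustering objective that the abstract refers to [Chi and Lange, 2015] can be sketched as follows. This is only the generic formulation, a squared-error fit term plus a weighted fusion penalty on pairwise centroid differences; it is not the classification-aware objective proposed in the paper, which the abstract does not spell out. The function name, the dictionary of pairwise weights, and the toy data are assumptions made for illustration.

```python
import math

def convex_clustering_objective(X, U, weights, gamma):
    """Generic convex clustering objective (illustrative, not the paper's method).

    X:       list of data points, each a list of floats
    U:       list of centroids, one per data point, same shape as X
    weights: dict mapping index pairs (i, j) to non-negative similarity weights
    gamma:   regularization strength; larger values fuse more centroids
    """
    # Fit term: 0.5 * sum_i ||x_i - u_i||^2
    fit = 0.5 * sum(
        (x - u) ** 2 for xi, ui in zip(X, U) for x, u in zip(xi, ui)
    )
    # Fusion term: sum over pairs of w_ij * ||u_i - u_j||_2
    fusion = sum(w * math.dist(U[i], U[j]) for (i, j), w in weights.items())
    return fit + gamma * fusion

# Hypothetical toy data: two nearby points and one distant point.
X = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0]]
weights = {(0, 1): 1.0, (0, 2): 0.1, (1, 2): 0.1}

# Every point keeps its own centroid: zero fit error, full fusion penalty.
no_fusion = convex_clustering_objective(X, X, weights, gamma=1.0)

# All centroids collapsed to the overall mean: zero fusion penalty, large fit error.
mean = [sum(col) / len(X) for col in zip(*X)]
all_fused = convex_clustering_objective(X, [mean] * len(X), weights, gamma=1.0)
```

Because both terms are convex in the centroids and the fit term is strictly convex, the minimizer is unique, which is the property behind the insensitivity to initialization mentioned above.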


