Sparse model-based clustering of three-way data via lasso-type penalties

07/20/2023
by Andrea Cappozzo, et al.

Mixtures of matrix Gaussian distributions provide a probabilistic framework for clustering continuous matrix-variate data, which are becoming increasingly prevalent in various fields. Despite its widespread adoption and successful application, this approach suffers from over-parameterization, making it poorly suited even to matrix-variate data of moderate size. To overcome this drawback, we introduce a sparse model-based clustering approach for three-way data. Our approach assumes that the matrix mixture parameters are sparse and have different degrees of sparsity across clusters, which induces parsimony in a flexible manner. Estimation of the model relies on the maximization of a penalized likelihood, with specifically tailored group and graphical lasso penalties. These penalties enable the selection of the most informative features for clustering three-way data in which variables are recorded over multiple occasions, and they allow cluster-specific association structures to be captured. The proposed methodology is tested extensively on synthetic data, and its validity is demonstrated in an application to time-dependent crime patterns in different US cities.
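
To make the idea of a penalized likelihood for a matrix Gaussian mixture concrete, the following is a minimal Python sketch rather than the authors' implementation. The matrix normal log-density is the standard one, parameterized by row and column precision matrices. The abstract does not give the exact form of the penalties, so everything in the penalty term is an assumption made for illustration: a group lasso over the rows of each cluster mean matrix (rows taken to be variables, columns occasions), an L1 penalty on the off-diagonal entries of each precision matrix, and the tuning parameters `lam_mean`, `lam_row` and `lam_col`. All function names are hypothetical.

```python
import numpy as np


def matrix_normal_logpdf(X, M, Omega, Gamma):
    """Log-density of a p x q matrix normal with mean M,
    row precision Omega (p x p) and column precision Gamma (q x q)."""
    p, q = X.shape
    R = X - M
    _, logdet_omega = np.linalg.slogdet(Omega)
    _, logdet_gamma = np.linalg.slogdet(Gamma)
    quad = np.trace(Gamma @ R.T @ Omega @ R)
    return (-0.5 * p * q * np.log(2.0 * np.pi)
            + 0.5 * q * logdet_omega
            + 0.5 * p * logdet_gamma
            - 0.5 * quad)


def off_diag_l1(A):
    """Sum of absolute values of the off-diagonal entries of A."""
    return np.abs(A).sum() - np.abs(np.diag(A)).sum()


def penalized_loglik(Xs, weights, Ms, Omegas, Gammas,
                     lam_mean, lam_row, lam_col):
    """Mixture log-likelihood minus illustrative lasso-type penalties.

    The penalty form is an assumption for this sketch, not taken
    from the paper: row-wise group lasso on each mean matrix and
    off-diagonal L1 on each precision matrix.
    """
    K = len(weights)
    loglik = 0.0
    for X in Xs:
        comp = np.array([
            np.log(weights[k])
            + matrix_normal_logpdf(X, Ms[k], Omegas[k], Gammas[k])
            for k in range(K)
        ])
        m = comp.max()  # log-sum-exp for numerical stability
        loglik += m + np.log(np.exp(comp - m).sum())

    penalty = 0.0
    for k in range(K):
        # Group lasso: one Euclidean norm per row (variable) of the mean.
        penalty += lam_mean * np.linalg.norm(Ms[k], axis=1).sum()
        # Graphical-lasso-style L1 on off-diagonal precision entries.
        penalty += lam_row * off_diag_l1(Omegas[k])
        penalty += lam_col * off_diag_l1(Gammas[k])
    return loglik - penalty
```

Because the penalties are applied separately to each cluster's parameters, a maximizer of such an objective can zero out different mean rows and different precision entries in different clusters, which is one way to read the "different degrees of sparsity across clusters" described above. The sketch only evaluates the objective; it does not perform the estimation.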
