Real Elliptically Skewed Distributions and Their Application to Robust Cluster Analysis

06/30/2020
by Christian A. Schroth, et al.

This article proposes a new class of Real Elliptically Skewed (RESK) distributions and associated clustering algorithms that integrate robustness and skewness into a single unified cluster analysis framework. Non-symmetrically distributed and heavy-tailed data clusters have been reported in a variety of real-world applications. Robustness is essential because a few outlying observations can severely obscure the cluster structure. The RESK distributions generalize the Real Elliptically Symmetric (RES) distributions. To estimate the cluster parameters and memberships, we derive an expectation-maximization (EM) algorithm for arbitrary RESK distributions. Special attention is given to a new robust skew-Huber M-estimator, which is also the maximum likelihood estimator (MLE) for the skew-Huber distribution, a member of the RESK class. Numerical experiments on simulated and real-world data confirm the usefulness of the proposed methods for skewed and heavy-tailed data sets.
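As a rough illustration of why M-estimation helps when a few outliers are present, the sketch below implements the classical univariate Huber location estimator by iterative reweighting. It is not the skew-Huber estimator or the EM algorithm from the article, which are multivariate and account for skewness. The function names `huber_weight` and `huber_location`, the tuning constant c = 1.345 (a conventional choice for Huber's estimator), and the fixed MAD-based scale are all assumptions made for this example.

```python
from statistics import median


def huber_weight(r, c=1.345):
    """Huber weight: 1 for |r| <= c, and c/|r| beyond that (downweights outliers)."""
    a = abs(r)
    return 1.0 if a <= c else c / a


def huber_location(x, c=1.345, tol=1e-9, max_iter=100):
    """Illustrative univariate Huber M-estimate of location (hypothetical helper).

    The scale is held fixed at the normalized median absolute deviation (MAD);
    the location is updated as a weighted mean until it stops changing.
    """
    mu = median(x)
    # 1.4826 makes the MAD consistent with the standard deviation for Gaussian data.
    scale = 1.4826 * median(abs(v - mu) for v in x)
    if scale == 0:
        return mu  # more than half of the points coincide; keep the median
    for _ in range(max_iter):
        weights = [huber_weight((v - mu) / scale, c) for v in x]
        new_mu = sum(w * v for w, v in zip(weights, x)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


if __name__ == "__main__":
    data = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 50.0]  # one gross outlier
    print("mean:", sum(data) / len(data))
    print("Huber location:", huber_location(data))
```

On the sample data, the arithmetic mean is pulled strongly toward the single outlier, while the Huber estimate stays close to the bulk of the observations. The same downweighting idea underlies robust estimation of cluster parameters, although the article's estimator operates on a different, skewed multivariate model.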

research
05/04/2020

Robust M-Estimation Based Bayesian Cluster Enumeration for Real Elliptically Symmetric Distributions

Robustly determining the optimal number of clusters in a data set is an ...
research
02/15/2019

Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations

T-distributed stochastic neighbour embedding (t-SNE) is a widely used da...
research
07/26/2021

Robust Regularized Locality Preserving Indexing for Fiedler Vector Estimation

The Fiedler vector of a connected graph is the eigenvector associated wi...
research
11/29/2018

Robust Bayesian Cluster Enumeration

A major challenge in cluster analysis is that the number of data cluster...
research
08/24/2021

A 2-stage elastic net algorithm for estimation of sparse networks with heavy tailed data

We propose a new 2-stage procedure that relies on the elastic net penalt...
research
08/25/2020

Regularization Methods Based on the L_q-Likelihood for Linear Models with Heavy-Tailed Errors

We propose regularization methods for linear models based on the L_q-lik...
research
04/03/2018

Grouped Heterogeneous Mixture Modeling for Clustered Data

Clustered data which has a grouping structure (e.g. postal area, school,...
