A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data

01/28/2022
by Florian Mouret, et al.

This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when compared to other popular approaches such as those based on k-nearest neighbors or on multiple imputation by chained equations. However, Gaussian mixture models are known not to be robust to heterogeneous data, which can lead to poor estimation performance when the data are contaminated by outliers or come from a non-Gaussian distribution. To overcome this issue, a new EM algorithm is investigated for mixtures of elliptical distributions, with the useful property of handling potential missing data. The complete-data likelihood associated with mixtures of elliptical distributions is well adapted to the EM framework thanks to its conditional distribution, which is shown to be a Student distribution. Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data. Furthermore, experiments conducted on real-world datasets show that this algorithm is very competitive when compared to other classical imputation methods.
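To make the role of the conditional distribution concrete, here is a minimal sketch of the classical Gaussian baseline mentioned above, not the proposed elliptical algorithm. It fills the missing entries of one sample with their conditional mean given the observed entries, for a single Gaussian component with known parameters. The function name `conditional_mean_impute` and the toy parameters are assumptions made for illustration; a full EM procedure would also re-estimate the parameters and weight several mixture components.

```python
import numpy as np

def conditional_mean_impute(x, mu, sigma):
    """Fill NaN entries of x with E[x_missing | x_observed] under N(mu, sigma).

    Hypothetical helper for illustration: single Gaussian component,
    parameters assumed known.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    miss = np.isnan(x)          # mask of missing coordinates
    obs = ~miss                 # mask of observed coordinates
    out = x.copy()

    if not miss.any():
        return out              # nothing to impute
    if not obs.any():
        return mu.copy()        # no information: fall back to the mean

    sigma_oo = sigma[np.ix_(obs, obs)]    # observed-observed covariance block
    sigma_mo = sigma[np.ix_(miss, obs)]   # missing-observed covariance block

    # mu_m + Sigma_mo Sigma_oo^{-1} (x_o - mu_o), using a linear solve
    # instead of an explicit matrix inverse.
    out[miss] = mu[miss] + sigma_mo @ np.linalg.solve(sigma_oo, x[obs] - mu[obs])
    return out

# Toy example (assumed values): two correlated variables, second one missing.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
x = np.array([2.0, np.nan])
imputed = conditional_mean_impute(x, mu, sigma)
print(imputed)
```

In this toy case the missing coordinate is pulled toward the observed one in proportion to their covariance. A single outlier in the observed coordinate shifts the imputed value linearly, which is one way to see why the Gaussian baseline can be sensitive to contaminated data.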

research
09/16/2018

Semiparametric fractional imputation using Gaussian mixture models for handling multivariate missing data

Item nonresponse is frequently encountered in practice. Ignoring missing...
research
01/14/2022

Estimating Gaussian Copulas with Missing Data

In this work we present a rigorous application of the Expectation Maximi...
research
04/11/2020

Handling missing data in a neural network approach for the identification of charged particles in a multilayer detector

Identification of charged particles in a multilayer detector by the ener...
research
02/01/2018

Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data

Presence of missing values in a dataset can adversely affect the perform...
research
03/24/2021

Envelope Methods with Ignorable Missing Data

Envelope method was recently proposed as a method to reduce the dimensio...
research
07/20/2021

A Stochastic Version of the EM Algorithm for Mixture Cure Rate Model with Exponentiated Weibull Family of Lifetimes

Handling missing values plays an important role in the analysis of survi...
