Mixture of Conditional Gaussian Graphical Models for unlabelled heterogeneous populations in the presence of co-factors

06/19/2020
by Thomas Lartigue, et al.

Conditional correlation networks, within Gaussian Graphical Models (GGM), are widely used to describe the direct interactions between the components of a random vector. In the case of an unlabelled heterogeneous population, Expectation Maximisation (EM) algorithms for Mixtures of GGM have been proposed to estimate both each sub-population's graph and the class labels. However, we argue that, with most real data, class affiliation cannot be described with a Mixture of Gaussians, which mostly groups data points according to their geometrical proximity. In particular, there often exist external co-features whose values affect the features' average value, scattering data points that belong to the same sub-population across the feature space. Additionally, if the co-features' effect on the features is heterogeneous, then the estimation of this effect cannot be separated from the sub-population identification. In this article, we propose a Mixture of Conditional GGM (CGGM) that subtracts the heterogeneous effects of the co-features to regroup the data points into clusters that correspond to the sub-populations. We develop a penalised EM algorithm to estimate graph-sparse model parameters. We demonstrate on synthetic and real data how this method fulfils its goal and succeeds in identifying the sub-populations in settings where the Mixtures of GGM are disrupted by the effect of the co-features.
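
To make the idea of subtracting a class-specific co-feature effect concrete, here is a minimal sketch of an E-step for a mixture of conditional Gaussians. It is an illustration under stated assumptions, not the authors' implementation: the regression-style parametrisation (mean of class k given co-features x is x @ betas[k]), the function name and all parameter values are hypothetical, and the parametrisation used in the paper may differ. The penalised M-step that would produce sparse precision matrices is not shown.

```python
import numpy as np


def conditional_responsibilities(Y, X, weights, betas, precisions):
    """Posterior class probabilities for a mixture of conditional Gaussians.

    Assumed (hypothetical) model: y | x, class k ~ N(x @ betas[k], inv(precisions[k])).
    Y: (n, p) features, X: (n, q) co-features, weights: (K,) class proportions,
    betas: K arrays of shape (q, p), precisions: K arrays of shape (p, p).
    """
    n, p = Y.shape
    log_post = np.empty((n, len(weights)))
    for k, (pi_k, beta_k, lam_k) in enumerate(zip(weights, betas, precisions)):
        # Subtract the class-specific effect of the co-features.
        resid = Y - X @ beta_k
        _, logdet = np.linalg.slogdet(lam_k)
        # Quadratic form resid_i^T Lambda_k resid_i for every data point i.
        quad = np.einsum("ij,jk,ik->i", resid, lam_k, resid)
        log_post[:, k] = (
            np.log(pi_k) + 0.5 * logdet - 0.5 * p * np.log(2 * np.pi) - 0.5 * quad
        )
    # Normalise in log space for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


# Two data points at the same position in feature space, with opposite co-feature values.
Y = np.array([[2.0, 2.0], [2.0, 2.0]])
X = np.array([[1.0], [-1.0]])
weights = np.array([0.5, 0.5])
betas = [np.array([[2.0, 2.0]]), np.array([[-2.0, -2.0]])]  # hypothetical effects
precisions = [np.eye(2), np.eye(2)]

post = conditional_responsibilities(Y, X, weights, betas, precisions)
labels = post.argmax(axis=1)
```

In this toy setting the two points coincide in feature space, so a clustering based on geometrical proximity alone cannot separate them, whereas the conditional model assigns them to different classes once each class's co-feature effect has been subtracted.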


