High Dimensional Semiparametric Latent Graphical Model for Mixed Data

by Jianqing Fan, et al.

Graphical models are commonly used tools for modeling multivariate random variables. While many convenient multivariate distributions exist for continuous data, such as the Gaussian distribution, mixed data that contain discrete variables, or a combination of continuous and discrete variables, pose new challenges in statistical modeling. In this paper, we propose a semiparametric model, named the latent Gaussian copula model, for binary and mixed data. The observed binary data are assumed to be obtained by dichotomizing a latent variable that follows the Gaussian copula distribution, also known as the nonparanormal distribution. The latent Gaussian model, which assumes that the latent variables are multivariate Gaussian, is a special case of the proposed model. A novel rank-based approach is proposed for both latent graph estimation and latent principal component analysis. Theoretically, the proposed methods achieve the same rates of convergence for both precision matrix estimation and eigenvector estimation as if the latent variables were observed. Under similar conditions, the consistency of graph structure recovery and of feature selection for leading eigenvectors is established. The performance of the proposed methods is assessed numerically through simulation studies, and their use is illustrated on a genetic dataset.
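The dichotomization idea and the rank-based idea described above can be illustrated with a minimal sketch. It is not the paper's estimator: all function names are hypothetical, it handles only a single pair of variables, and it assumes both latent variables are cut at their medians (threshold zero on the Gaussian scale). In that special case the population Kendall's tau of the binary pair equals arcsin(rho)/pi, so the latent correlation can be recovered as sin(pi * tau). The sketch also shows why the monotone transformation in the copula model does not matter here: dichotomizing the transformed variable at the transformed threshold yields the same binary data.

```python
import math
import random


def sample_latent_pairs(n, rho, seed=0):
    """Draw n pairs from a bivariate standard normal with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * e
        pairs.append((z1, z2))
    return pairs


def dichotomize(values, threshold):
    """Return 1 where the value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def kendall_tau_binary(b1, b2):
    """Sample Kendall's tau-a for two binary vectors.

    Concordant pairs match a (1, 1) observation with a (0, 0) one;
    discordant pairs match a (1, 0) observation with a (0, 1) one.
    """
    n = len(b1)
    n11 = sum(1 for a, b in zip(b1, b2) if a == 1 and b == 1)
    n00 = sum(1 for a, b in zip(b1, b2) if a == 0 and b == 0)
    n10 = sum(1 for a, b in zip(b1, b2) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(b1, b2) if a == 0 and b == 1)
    return 2.0 * (n11 * n00 - n10 * n01) / (n * (n - 1))


def rho_from_tau_zero_threshold(tau):
    """Invert tau = arcsin(rho) / pi (ASSUMPTION: both thresholds are zero)."""
    return math.sin(math.pi * tau)


def demo(n=20000, rho=0.5, seed=1):
    latent = sample_latent_pairs(n, rho, seed)
    z1 = [p[0] for p in latent]
    z2 = [p[1] for p in latent]

    # Binary data from the latent Gaussian scale, cut at zero.
    b1 = dichotomize(z1, 0.0)
    b2 = dichotomize(z2, 0.0)

    # Same data after a monotone transform f(z) = z**3, cut at f(0) = 0.
    b1_copula = dichotomize([z ** 3 for z in z1], 0.0)

    tau = kendall_tau_binary(b1, b2)
    return b1 == b1_copula, tau, rho_from_tau_zero_threshold(tau)


if __name__ == "__main__":
    same_binary, tau, rho_hat = demo()
    print("binary data unchanged by monotone transform:", same_binary)
    print("Kendall's tau on binary data:", round(tau, 4))
    print("recovered latent correlation:", round(rho_hat, 4))
```

With general thresholds the relation between Kendall's tau and the latent correlation involves the bivariate normal distribution function and has to be inverted numerically; the closed form above is used only to keep the sketch dependency-free.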




Causal Clustering for 1-Factor Measurement Models on Data with Various Types

The tetrad constraint is a condition of which the satisfaction signals a...

Blessing of Dependence: Identifiability and Geometry of Discrete Models with Multiple Binary Latent Variables

Identifiability of discrete statistical models with latent variables is ...

Binary Independent Component Analysis via Non-stationarity

We consider independent component analysis of binary data. While fundame...

Adaptive probabilistic principal component analysis

Using the linear Gaussian latent variable model as a starting point we r...

Generalized Matrix Factorization

Unmeasured or latent variables are often the cause of correlations betwe...

Copula graphical models for heterogeneous mixed data

This article proposes a graphical model that can handle mixed-type, mult...

Neural Processes Mixed-Effect Models for Deep Normative Modeling of Clinical Neuroimaging Data

Normative modeling has recently been introduced as a promising approach ...
