Imputation of missing data using multivariate Gaussian Linear Cluster-Weighted Modeling

Missing data arise when values of variables of interest are not recorded or observed. Most statistical theory, however, assumes that the data are complete. One way to handle an incomplete database is imputation: filling the gaps left by the missing information according to specific criteria. In this study, we propose a novel imputation methodology for databases with non-response units that leverages additional information from fully observed auxiliary variables. We assume that the variables in the database are continuous and that the auxiliary variables help to improve the imputation capacity of the model. Within a fully Bayesian framework, our method uses a flexible mixture of multivariate normal distributions to jointly model the response and auxiliary variables. Following the principles of Gaussian Cluster-Weighted modeling, we construct a predictive model that imputes the missing values using information from the covariates. We present simulation studies and a real data illustration to demonstrate the imputation capacity of our method across various scenarios, comparing it to other methods in the literature.
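The predictive step described above can be illustrated with a minimal sketch. It is not the fully Bayesian procedure of the paper: it assumes the mixture weights, means, and covariances are already known (for example, a single posterior draw), and it imputes a missing response y by its conditional mean given the observed auxiliary vector x. The function names `log_gaussian` and `impute_conditional_mean`, and the toy parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate normal evaluated at one point x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def impute_conditional_mean(x, weights, means, covs, n_aux):
    """E[y | x] under a Gaussian mixture on the joint vector (x, y).

    Assumption: the first n_aux coordinates of each component are the
    auxiliary variables x, the remaining ones are the response y.
    """
    log_post, cond_means = [], []
    for pi_k, mu, sigma in zip(weights, means, covs):
        mu_x, mu_y = mu[:n_aux], mu[n_aux:]
        s_xx = sigma[:n_aux, :n_aux]
        s_yx = sigma[n_aux:, :n_aux]
        # Unnormalised log weight of component k given x.
        log_post.append(np.log(pi_k) + log_gaussian(x, mu_x, s_xx))
        # Within-component linear regression of y on x.
        cond_means.append(mu_y + s_yx @ np.linalg.solve(s_xx, x - mu_x))
    log_post = np.array(log_post)
    resp = np.exp(log_post - log_post.max())  # stabilised normalisation
    resp /= resp.sum()
    return resp @ np.array(cond_means)

# Toy two-component mixture with one auxiliary and one response variable.
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, -2.0], [2.0, 2.0]])
covs = np.array([[[1.0, 0.8], [0.8, 1.0]],
                 [[1.0, -0.5], [-0.5, 1.0]]])

y_hat = impute_conditional_mean(np.array([0.0]), weights, means, covs, n_aux=1)
print(y_hat)
```

Each component contributes a linear prediction of y from x, and the contributions are weighted by how plausible x is under that component. A Bayesian version would repeat this over posterior draws of the parameters, and would typically sample from the conditional distribution rather than plug in its mean.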


