Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods

05/10/2023
by Nhat-Hao Pham, et al.

Correlation matrix visualization is essential for understanding the relationships between variables in a dataset, but missing data can make the correlation coefficients difficult to estimate. In this paper, we compare the effects of various missing data methods on the correlation plot, focusing on two common missing patterns: random and monotone. We aim to provide practical strategies and recommendations for researchers and practitioners who create and analyze correlation plots. Our experimental results suggest that, although imputation is commonly used for missing data, plotting the correlation matrix from imputed data can lead to significantly misleading inferences about the relationships between features. Based on its performance in the experiments, we recommend DPER, a direct parameter estimation approach, for plotting the correlation matrix.
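The gap between imputing first and estimating the correlation directly can be illustrated with a small simulation. The sketch below is not DPER and does not reproduce the paper's experiments. It assumes synthetic Gaussian data, values missing completely at random in one variable, mean imputation as one example of an imputation method, and a pairwise-complete Pearson estimate as a simple stand-in for a direct estimator. All function and variable names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_impute(values):
    """Replace each missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def pairwise_complete(xs, ys):
    """Keep only the positions where both values are observed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Assumed setup: two standard normal variables with true correlation 0.8.
rng = random.Random(0)
n = 2000
rho = 0.8
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rho * xi + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for xi in x]

# Assumed missingness: about 40% of y removed completely at random.
y_missing = [None if rng.random() < 0.4 else yi for yi in y]

r_full = pearson(x, y)                                   # no missing data
r_imputed = pearson(x, mean_impute(y_missing))           # impute, then correlate
r_pairwise = pearson(*pairwise_complete(x, y_missing))   # direct estimate

print(f"full data:         {r_full:.3f}")
print(f"mean imputation:   {r_imputed:.3f}")
print(f"pairwise complete: {r_pairwise:.3f}")
```

In this setup the mean-imputed estimate is pulled toward zero, because every imputed point sits at the mean of the observed values and contributes nothing to the covariance, while the estimate computed only from observed pairs stays close to the full-data value. Other imputation methods and other missing patterns, such as the monotone pattern studied in the paper, may behave differently.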

research
06/26/2021

FCMI: Feature Correlation based Missing Data Imputation

Processed data are insightful, and crude data are obtuse. A serious thre...
research
06/06/2021

DPER: Efficient Parameter Estimation for Randomly Missing Data

The missing data problem has been broadly studied in the last few decade...
research
05/15/2022

Inference with Imputed Data: The Allure of Making Stuff Up

Incomplete observability of data generates an identification problem. Th...
research
11/24/2020

To Explore What Isn't There – Glyph-based Visualization for Analysis of Missing Values

This paper contributes a novel visualization method, Missingness Glyph, ...
research
04/08/2019

Multiple imputation in data that grow over time: A comparison of three strategies

Multiple imputation is a highly recommended technique to deal with missi...
research
10/05/2022

Estimating Aging Curves: Using Multiple Imputation to Examine Career Trajectories of MLB Offensive Players

In sports, an aging curve depicts the relationship between average perfo...
