Handling Missing Data with Graph Representation Learning

10/30/2020
by Jiaxuan You, et al.

Machine learning with missing data has been approached in two different ways: feature imputation, where missing feature values are estimated based on observed values, and label prediction, where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose GRAPE, a graph-based framework for feature imputation as well as label prediction. GRAPE tackles the missing data problem using a graph representation, where the observations and features are viewed as two types of nodes in a bipartite graph, and the observed feature values as edges. Under the GRAPE framework, feature imputation is formulated as an edge-level prediction task and label prediction as a node-level prediction task. These tasks are then solved with Graph Neural Networks. Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks, compared with existing state-of-the-art methods.
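The bipartite representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `table_to_bipartite_edges`, the node-numbering convention (observation i is node i, feature j is node n_obs + j), and the toy matrix are all assumptions made for illustration, and the Graph Neural Network that would consume the graph is omitted.

```python
import numpy as np

def table_to_bipartite_edges(X):
    """Convert a data matrix with NaNs into a bipartite edge list.

    Hypothetical helper: observation i becomes node i and feature j
    becomes node n_obs + j. Each observed entry X[i, j] becomes one
    edge (i, n_obs + j) whose attribute is the observed value.
    """
    X = np.asarray(X, dtype=float)
    n_obs, _ = X.shape
    observed = ~np.isnan(X)

    # Row/column indices of observed entries, in row-major order.
    rows, cols = np.nonzero(observed)
    edge_index = np.stack([rows, cols + n_obs])  # shape (2, num_edges)
    edge_attr = X[rows, cols]                    # observed feature values

    # Missing entries are the (observation, feature) pairs whose edge
    # value an imputation model would be asked to predict.
    missing = np.argwhere(~observed)
    return edge_index, edge_attr, missing

# Toy table: 2 observations, 3 features, 2 missing values.
X = np.array([[0.3, np.nan, 0.5],
              [np.nan, 0.1, 0.9]])
edge_index, edge_attr, missing = table_to_bipartite_edges(X)
```

In this sketch, imputation corresponds to predicting a value for each pair in `missing` (an edge-level task), while label prediction would attach a target to each observation node (a node-level task).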

research
08/13/2022

GEDI: A Graph-based End-to-end Data Imputation Framework

Data imputation is an effective way to handle missing data, which is com...
research
12/06/2022

Data Imputation with Iterative Graph Reconstruction

Effective data imputation demands rich latent “structure” discovery capa...
research
10/19/2022

EGG-GAE: scalable graph neural networks for tabular data imputation

Missing data imputation (MDI) is crucial when dealing with tabular datas...
research
09/18/2023

Towards Better Modeling with Missing Data: A Contrastive Learning-based Visual Analytics Perspective

Missing data can pose a challenge for machine learning (ML) modeling. To...
research
07/05/2022

Data Integrity Error Localization in Networked Systems with Missing Data

Most recent network failure diagnosis systems focused on data center net...
research
04/28/2022

Coupling Deep Imputation with Multitask Learning for Downstream Tasks on Genomics Data

Genomics data such as RNA gene expression, methylation and micro RNA exp...
research
10/15/2022

Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques

Objective: The proper handling of missing values is critical to deliveri...
