Semi-supervised learning for structured regression on partially observed attributed graphs

03/28/2018
by Jelena Stojanovic, et al.

Conditional probabilistic graphical models provide a powerful framework for structured regression in spatio-temporal datasets with complex correlation patterns. However, in real-life applications a large fraction of observations is often missing, which can severely limit the representational power of these models. In this paper we propose a Marginalized Gaussian Conditional Random Fields (m-GCRF) structured regression model for dealing with missing labels in partially observed temporal attributed graphs. The method learns from both the labeled and unlabeled parts of the graph and predicts future values in the graph. It can even learn from nodes whose response variable has never been observed at any past time step, a setting that poses problems for many state-of-the-art models that can handle missing data. The proposed model is characterized under various missingness mechanisms on 500 synthetic graphs. The benefits of the new method are also demonstrated on a challenging application: predicting precipitation from partial observations of climate variables in a temporal graph that spans the entire continental US. We also show that the method can be useful for optimizing the costs of data collection in climate applications by actively reducing the number of weather stations to consider. In experiments on these real-world and synthetic datasets we show that the proposed model is consistently more accurate than alternative semi-supervised structured models, as well as models that either use imputation to deal with missing values or simply ignore them altogether.
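To make the idea of marginalizing over missing labels concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified single-predictor GCRF in which the conditional distribution of the labels is Gaussian with precision matrix Q = 2(alpha*I + beta*L), where L is the graph Laplacian, and mean Q^{-1}(2*alpha*R). The function names, the parameters `alpha` and `beta`, and the toy graph are all hypothetical. The sketch relies only on a standard Gaussian property: the marginal over the observed nodes is obtained by selecting the matching entries of the mean and covariance, so unobserved nodes need neither imputation nor removal.

```python
import numpy as np

def gcrf_gaussian(R, S, alpha, beta):
    """Mean and covariance of an assumed single-predictor GCRF.

    R: unstructured predictions per node, S: symmetric similarity matrix.
    This parameterization is an illustrative assumption.
    """
    L = np.diag(S.sum(axis=1)) - S                 # graph Laplacian
    Q = 2.0 * (alpha * np.eye(len(R)) + beta * L)  # precision matrix
    Sigma = np.linalg.inv(Q)
    mu = Sigma @ (2.0 * alpha * R)
    return mu, Sigma

def marginal_log_likelihood(y, observed, mu, Sigma):
    """Gaussian log-likelihood of the observed labels only.

    Missing labels are marginalized out by taking the observed
    sub-vector of mu and the observed sub-block of Sigma.
    """
    diff = y[observed] - mu[observed]
    Sigma_oo = Sigma[np.ix_(observed, observed)]
    _, logdet = np.linalg.slogdet(Sigma_oo)
    quad = diff @ np.linalg.solve(Sigma_oo, diff)
    return -0.5 * (diff.size * np.log(2.0 * np.pi) + logdet + quad)

def predict_missing(y, observed, mu, Sigma):
    """Conditional mean of the unobserved nodes given the observed ones."""
    missing = ~observed
    Sigma_oo = Sigma[np.ix_(observed, observed)]
    Sigma_mo = Sigma[np.ix_(missing, observed)]
    diff = y[observed] - mu[observed]
    return mu[missing] + Sigma_mo @ np.linalg.solve(Sigma_oo, diff)

# Toy chain graph with four nodes; node 2 has no observed label.
S = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
R = np.array([1.0, 1.5, 2.0, 2.5])
y = np.array([1.1, 1.4, np.nan, 2.7])
observed = ~np.isnan(y)

mu, Sigma = gcrf_gaussian(R, S, alpha=1.0, beta=0.5)
print(marginal_log_likelihood(y, observed, mu, Sigma))
print(predict_missing(y, observed, mu, Sigma))
```

In a learning setting, `alpha` and `beta` would be chosen to maximize this marginal likelihood over the training time steps. How the paper parameterizes and optimizes the model is not stated in the abstract, so that step is left out here.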


research
06/03/2021

Semi-supervised Conditional Density Estimation for Imputation and Classification of Incomplete Instances

Incomplete instances with various missing attributes in many real-world ...
research
08/15/2023

Semi-Supervised Learning with Multiple Imputations on Non-Random Missing Labels

Semi-Supervised Learning (SSL) is implemented when algorithms are traine...
research
02/15/2023

Are labels informative in semi-supervised learning? – Estimating and leveraging the missing-data mechanism

Semi-supervised learning is a powerful technique for leveraging unlabele...
research
06/09/2022

Accurate Node Feature Estimation with Structured Variational Graph Autoencoder

Given a graph with partial observations of node features, how can we est...
research
11/26/2022

Multiple imputation for logistic regression models: incorporating an interaction

Background: Multiple imputation is often used to reduce bias and gain ef...
research
09/03/2015

Semi-described and semi-supervised learning with Gaussian processes

Propagating input uncertainty through non-linear Gaussian process (GP) m...
research
09/02/2019

Further results on structured regression for multi-scale networks

Gaussian Conditional Random Fields (GCRF), as a structured regression mo...
