Semi-Supervised Graph Imbalanced Regression

05/20/2023
by Gang Liu, et al.

Data imbalance arises in annotated data when observations of certain continuous label values are difficult to collect for regression tasks. In molecule and polymer property prediction, annotated graph datasets are often small because labeling requires expensive equipment and effort. To address the lack of examples with rare label values in graph regression tasks, we propose a semi-supervised framework that progressively balances the training data and reduces model bias via self-training. Training data balance is achieved by (1) pseudo-labeling more graphs for under-represented labels using a novel regression confidence measurement and (2) augmenting graph examples in latent space for the label values that remain rare after balancing with pseudo-labels. The former identifies quality examples from unlabeled data whose labels are confidently predicted and samples a subset of them with a reverse distribution from the imbalanced annotated data. The latter works with the former to target a perfect balance using a novel label-anchored mixup algorithm. We perform experiments on seven regression tasks on graph datasets. Results demonstrate that the proposed framework significantly reduces the error of predicted graph properties, especially in under-represented label areas.
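
The reverse-distribution sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the binning scheme, the "fill each bin up to the largest bin" quota rule, the confidence threshold, and all names (`reverse_distribution_sample`, `n_bins`, `min_conf`) are assumptions made for illustration, and the confidence scores are taken as given rather than computed with the paper's measurement.

```python
import numpy as np

def reverse_distribution_sample(labeled_y, pseudo_y, confidence,
                                n_bins=10, min_conf=0.5, seed=0):
    """Select confident pseudo-labeled examples, favoring rare label bins.

    Hypothetical helper: returns indices into `pseudo_y`.
    """
    rng = np.random.default_rng(seed)
    # Equal-width bins over the labeled range (an illustrative assumption).
    edges = np.linspace(labeled_y.min(), labeled_y.max(), n_bins + 1)
    # Digitizing against interior edges yields bin ids in 0..n_bins-1.
    labeled_bins = np.digitize(labeled_y, edges[1:-1])
    counts = np.bincount(labeled_bins, minlength=n_bins)
    # Quota per bin: examples missing relative to the most populated bin,
    # so rarer bins receive more pseudo-labeled additions.
    quota = counts.max() - counts
    pseudo_bins = np.digitize(pseudo_y, edges[1:-1])
    chosen = []
    for b in range(n_bins):
        # Only confidently predicted examples are eligible.
        candidates = np.flatnonzero((pseudo_bins == b) & (confidence >= min_conf))
        k = min(int(quota[b]), candidates.size)
        if k > 0:
            chosen.extend(rng.choice(candidates, size=k, replace=False))
    return np.array(sorted(chosen), dtype=int)

# Toy data: five labeled examples with low labels, two with high labels.
labeled_y = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.9, 1.0])
pseudo_y = np.array([0.1, 0.2, 0.6, 0.7, 0.8, 0.9, 0.95])
confidence = np.array([0.9, 0.9, 0.9, 0.2, 0.9, 0.9, 0.9])
picked = reverse_distribution_sample(labeled_y, pseudo_y, confidence, n_bins=2)
```

In this toy setup the low-label bin is already the largest, so only confident pseudo-labeled examples from the high-label bin are selected. Any label values that remain rare after this step would, per the abstract, be handled by the latent-space mixup augmentation, which is not sketched here.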

research
06/10/2021

Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning

The capability of the traditional semi-supervised learning (SSL) methods...
research
07/28/2022

Learning to Adapt Classifier for Imbalanced Semi-supervised Learning

Pseudo-labeling has proven to be a promising semi-supervised learning (S...
research
01/20/2022

Informative Pseudo-Labeling for Graph Neural Networks with Few Labels

Graph Neural Networks (GNNs) have achieved state-of-the-art results for ...
research
02/17/2022

CLS: Cross Labeling Supervision for Semi-Supervised Learning

It is well known that the success of deep neural networks is greatly att...
research
01/15/2021

In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning

The recent research in semi-supervised learning (SSL) is mostly dominate...
research
09/13/2021

POPCORN: Progressive Pseudo-labeling with Consistency Regularization and Neighboring

Semi-supervised learning (SSL) uses unlabeled data to compensate for the...
research
05/12/2021

Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

Negative sampling schemes enable efficient training given a large number...