ScatterSample: Diversified Label Sampling for Data Efficient Graph Neural Network Learning

06/09/2022
by   Zhenwei Dai, et al.

What target labels are most effective for graph neural network (GNN) training? In some applications where GNNs excel, such as drug design or fraud detection, labeling new instances is expensive. We develop a data-efficient active sampling framework, ScatterSample, to train GNNs under an active learning setting. ScatterSample employs a sampling module termed DiverseUncertainty to collect instances with large uncertainty from different regions of the sample space for labeling. To ensure diversification of the selected nodes, DiverseUncertainty clusters the high-uncertainty nodes and selects a representative node from each cluster. Our ScatterSample algorithm is further supported by rigorous theoretical analysis demonstrating its advantage over standard active sampling methods that simply maximize uncertainty without diversifying the samples. In particular, we show that ScatterSample efficiently reduces the model uncertainty over the whole sample space. Our experiments on five datasets show that ScatterSample significantly outperforms the other GNN active learning baselines; specifically, it reduces the sampling cost by up to 50% while achieving the same test accuracy.
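The DiverseUncertainty module described above can be sketched in a few steps: score each unlabeled node by predictive uncertainty, keep the most uncertain candidates, cluster their embeddings into as many clusters as the labeling budget, and pick the most uncertain node from each cluster. The sketch below is a hypothetical illustration of that recipe, not the authors' implementation; the entropy-based uncertainty, the candidate fraction, and the plain Lloyd-style k-means are all assumptions for the sake of a self-contained example.

```python
import numpy as np

def diverse_uncertainty_sample(probs, embeddings, budget,
                               candidate_frac=0.5, n_iter=20, seed=0):
    """Illustrative sketch of a DiverseUncertainty-style selection step.

    probs      : (n, C) predicted class probabilities for unlabeled nodes
    embeddings : (n, d) node embeddings (e.g., from the GNN's last layer)
    budget     : number of nodes to select for labeling
    """
    eps = 1e-12
    # Uncertainty as predictive entropy (an assumed choice of measure).
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)

    # Keep the most uncertain candidates (candidate_frac is hypothetical).
    n_cand = max(budget, int(candidate_frac * len(probs)))
    cand = np.argsort(-entropy)[:n_cand]

    # Simple k-means (Lloyd iterations) over candidate embeddings,
    # with one cluster per unit of labeling budget.
    rng = np.random.default_rng(seed)
    X = embeddings[cand].astype(float)
    centers = X[rng.choice(len(X), size=budget, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for k in range(budget):
            mask = assign == k
            if mask.any():
                centers[k] = X[mask].mean(axis=0)

    # From each cluster, select the single most uncertain node.
    selected = []
    for k in range(budget):
        members = cand[assign == k]
        if len(members):
            selected.append(members[np.argmax(entropy[members])])
    return np.array(selected)
```

Clustering before selection is what enforces diversity: a pure uncertainty ranking can return many near-duplicate nodes from one dense region, while one pick per cluster spreads the budget across the sample space.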


Related research

06/24/2020 · Graph Policy Network for Transferable Active Learning on Graphs
Graph neural networks (GNNs) have been attracting increasing popularity ...

03/02/2022 · Information Gain Propagation: a new way to Graph Active Learning with Soft Labels
Graph Neural Networks (GNNs) have achieved great success in various task...

10/30/2020 · Deep Active Graph Representation Learning
Graph neural networks (GNNs) aim to learn graph representations that pre...

12/02/2022 · SMARTQUERY: An Active Learning Framework for Graph Neural Networks through Hybrid Uncertainty Reduction
Graph neural networks have achieved significant success in representatio...

04/17/2020 · Active Sentence Learning by Adversarial Uncertainty Sampling in Discrete Space
In this paper, we focus on reducing the labeled data size for sentence l...

12/01/2022 · Uniform versus uncertainty sampling: When being active is less efficient than staying passive
It is widely believed that given the same labeling budget, active learni...

01/28/2023 · Leveraging Importance Weights in Subset Selection
We present a subset selection algorithm designed to work with arbitrary ...
