DiffusAL: Coupling Active Learning with Graph Diffusion for Label-Efficient Node Classification

by Sandra Gilhuber, et al.
Siemens AG

Node classification is one of the core tasks on attributed graphs, but successful graph learning solutions require sufficient labeled data. To keep annotation costs low, active graph learning focuses on selecting the most informative subset of nodes to label, so that label efficiency is maximized. However, deciding which selection heuristic is best suited to a given unlabeled graph remains a persistent challenge. Existing solutions either neglect to align the learned model with the sampling method or focus on only a limited set of selection aspects. As a result, they sometimes perform worse than, or only as well as, random sampling. In this work, we introduce a novel active graph learning approach called DiffusAL, which shows significant robustness in diverse settings. Toward better transferability between different graph structures, we combine three independent scoring functions in a parameter-free way to identify the most informative node samples for labeling: i) Model Uncertainty, ii) Diversity Component, and iii) Node Importance computed via graph diffusion heuristics. Most of our calculations for acquisition and training can be pre-processed, making DiffusAL more efficient than approaches that combine diverse selection criteria and about as fast as simpler heuristics. Our experiments on various benchmark datasets show that, unlike previous methods, our approach significantly outperforms random selection in 100% of the budgets tested.
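
The combination of the three scores can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the diffusion is taken to be personalized PageRank approximated by power iteration, model uncertainty is taken to be predictive entropy, diversity is approximated by choosing at most one node per precomputed cluster before filling the remaining budget, and the scores are combined by a plain product so that no weighting has to be tuned. The function names `ppr_diffusion`, `entropy`, and `select_nodes`, as well as the teleport value `alpha`, are hypothetical.

```python
import numpy as np

def ppr_diffusion(adj, alpha=0.15, iters=50):
    """Approximate a personalized PageRank matrix by power iteration.
    Assumption: PPR stands in for the 'graph diffusion heuristics'."""
    n = adj.shape[0]
    a = adj + np.eye(n)                                   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    t = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]     # symmetric normalization
    s = np.eye(n)
    for _ in range(iters):
        s = alpha * np.eye(n) + (1.0 - alpha) * (t @ s)
    return s

def entropy(probs):
    """Predictive entropy per node (assumed uncertainty measure)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_nodes(adj, probs, clusters, labeled, budget):
    """Rank unlabeled nodes by uncertainty * importance, then pick
    cluster-diverse nodes first (hypothetical selection rule)."""
    importance = ppr_diffusion(adj).sum(axis=0)           # diffused mass per node
    score = entropy(probs) * importance                   # product, no weights
    score[np.array(sorted(labeled), dtype=int)] = -np.inf # never re-pick labeled nodes
    ranked = [int(v) for v in np.argsort(-score) if np.isfinite(score[v])]
    picked, seen = [], set()
    for v in ranked:                                      # pass 1: one node per cluster
        if len(picked) < budget and clusters[v] not in seen:
            picked.append(v)
            seen.add(clusters[v])
    for v in ranked:                                      # pass 2: fill remaining budget
        if len(picked) < budget and v not in picked:
            picked.append(v)
    return picked
```

In this sketch the diffusion matrix depends only on the graph, so it could be computed once before the active learning loop starts, which mirrors the abstract's remark that most calculations can be pre-processed. The cluster assignment passed as `clusters` is likewise assumed to be computed ahead of time, for example by clustering diffused node features.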


