Representing Long-Range Context for Graph Neural Networks with Global Attention

by Zhanghao Wu, et al.

Graph neural networks are powerful architectures for structured datasets. However, current methods struggle to represent long-range dependencies. Scaling the depth or width of GNNs is insufficient to broaden receptive fields, as larger GNNs encounter optimization instabilities such as vanishing gradients and representation oversmoothing, while pooling-based approaches have yet to become as universally useful as in computer vision. In this work, we propose the use of Transformer-based self-attention to learn long-range pairwise relationships, with a novel "readout" mechanism to obtain a global graph embedding. Inspired by recent computer-vision results finding position-invariant attention performant at learning long-range relationships, our method, which we call GraphTrans, applies a permutation-invariant Transformer module after a standard GNN module. This simple architecture leads to state-of-the-art results on several graph classification tasks, outperforming methods that explicitly encode graph structure. Our results suggest that purely learning-based approaches without graph structure may be suitable for learning high-level, long-range relationships on graphs. Code for GraphTrans is available at
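The high-level pipeline the abstract describes (local message passing, then position-free global self-attention with a special readout token that attends over all nodes) can be sketched in plain NumPy. This is a minimal illustrative sketch, not the authors' implementation: the single GNN layer, single attention head, weight shapes, and the `cls` readout token are all assumptions made for the example. Because the Transformer stage uses no positional encoding, the readout embedding does not depend on node ordering.

```python
import numpy as np

rng = np.random.default_rng(0)

def gnn_layer(h, adj, w):
    # One message-passing step: mean-aggregate neighbor features
    # (with a self-loop), then a linear map and ReLU.
    a = adj + np.eye(adj.shape[0])
    h = (a / a.sum(axis=1, keepdims=True)) @ h
    return np.maximum(h @ w, 0.0)

def self_attention(x, wq, wk, wv):
    # Single-head scaled dot-product attention over all tokens.
    # No positional encoding, so it is permutation-invariant over nodes.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v

def graphtrans_sketch(node_feats, adj, d=8):
    # GNN module -> prepend a <CLS>-style readout token -> Transformer
    # self-attention; the readout token's output is the graph embedding.
    n, f = node_feats.shape
    w_gnn = rng.normal(size=(f, d)) / np.sqrt(f)
    wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
    cls = rng.normal(size=(1, d))          # hypothetical learnable readout token
    h = gnn_layer(node_feats, adj, w_gnn)  # local structure
    tokens = np.vstack([cls, h])           # readout token + node embeddings
    out = self_attention(tokens, wq, wk, wv)
    return out[0]                          # global graph embedding

# Usage: a 3-node path graph with 4-dimensional node features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = rng.normal(size=(3, 4))
emb = graphtrans_sketch(feats, adj)
```

In the real architecture the readout embedding would feed a classification head; here the sketch just returns it so the GNN-then-Transformer composition is visible end to end.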






