Investigating Expressiveness of Transformer in Spectral Domain for Graphs

by Anson Bastos et al.

Transformers have been proven to be inadequate for graph representation learning. To understand this inadequacy, there is a need to investigate whether spectral analysis of the transformer will reveal insights into its expressive power. Similar studies have already established that spectral analysis of Graph Neural Networks (GNNs) provides extra perspectives on their expressiveness. In this work, we systematically study and establish the link between the spatial and spectral domains in the realm of the transformer. We further provide a theoretical analysis showing that the spatial attention mechanism in the transformer cannot effectively capture the desired frequency response, thus inherently limiting its expressiveness in spectral space. Therefore, we propose FeTA, a framework that aims to perform attention over the entire graph spectrum, analogous to attention in spatial space. Empirical results suggest that FeTA provides a homogeneous performance gain over the vanilla transformer across all tasks on standard benchmarks and can easily be extended to GNN-based models with low-pass characteristics (e.g., GAT). Furthermore, replacing the vanilla transformer model with FeTA in recently proposed position encoding schemes yields performance comparable to or better than transformer and GNN baselines.
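To make the idea of attention over the graph spectrum concrete, the sketch below filters node features in the frequency domain of the graph Laplacian, weighting each frequency component with an attention score. It is a minimal illustration under assumed inputs (a dense adjacency matrix, a node feature matrix, and externally supplied per-frequency scores `theta`); the function name and setup are hypothetical and do not reproduce the authors' FeTA architecture.

```python
# Minimal sketch of attention over the graph spectrum (illustrative only;
# not the authors' FeTA implementation).
import numpy as np

def spectral_attention(A, X, theta):
    """Filter node features in the spectral domain of the graph Laplacian.

    A     : (n, n) adjacency matrix of an undirected graph
    X     : (n, d) node feature matrix
    theta : (n,)   one score per frequency component (assumed to come from
                   a learned module in a full model)
    """
    n = A.shape[0]

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    L = np.eye(n) - (d_inv_sqrt[:, None] * A) * d_inv_sqrt[None, :]

    # Graph Fourier basis: eigenvectors U of L; eigenvalues are the frequencies.
    _, U = np.linalg.eigh(L)

    # Softmax attention over frequency components.
    alpha = np.exp(theta - theta.max())
    alpha = alpha / alpha.sum()

    X_hat = U.T @ X                        # graph Fourier transform of features
    return U @ (alpha[:, None] * X_hat)    # re-weighted spectrum, back in spatial domain
```

In contrast to spatial attention, which mixes node features according to pairwise scores, this formulation re-scales whole frequency bands, so the learned weights directly shape the filter's frequency response rather than being restricted to low-pass behavior.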



