Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

10/27/2021
by Jinwoo Kim, et al.

We present a generalization of Transformers to any-order permutation-invariant data (sets, graphs, and hypergraphs). We begin by observing that Transformers generalize DeepSets, i.e., first-order (set-input) permutation-invariant MLPs. Then, based on recently characterized higher-order invariant MLPs, we extend the concept of self-attention to higher orders and propose higher-order Transformers for order-k data (k=2 for graphs and k>2 for hypergraphs). Unfortunately, higher-order Transformers turn out to have prohibitive complexity 𝒪(n^{2k}) in the number of input nodes n. To address this problem, we present sparse higher-order Transformers whose complexity is quadratic in the number of input hyperedges, and further adopt the kernel attention approach to reduce the complexity to linear. In particular, we show that sparse second-order Transformers with kernel attention are theoretically more expressive than message-passing operations while having asymptotically identical complexity. Our models achieve significant performance improvements over invariant MLPs and message-passing graph neural networks in large-scale graph regression and set-to-(hyper)graph prediction tasks. Our implementation is available at https://github.com/jw9730/hot.
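The abstract states that kernel attention reduces the cost of attending over input hyperedges from quadratic to linear. The sketch below illustrates that general idea (linearized attention via a positive feature map); the function name, the ELU+1 feature map, and the shapes are illustrative assumptions and not the authors' implementation, which is available at the repository linked above.

```python
import torch

def kernel_attention(q, k, v, eps=1e-6):
    """Linearized attention over n elements (e.g., hyperedge tokens).

    q, k: (n, d), v: (n, d_v). Cost is O(n * d * d_v) rather than O(n^2),
    because keys and values are aggregated once before mixing with queries.
    """
    # Positive feature map; ELU(x) + 1 is an assumption in the style of
    # linear-attention methods, not necessarily the map used in the paper.
    phi = lambda x: torch.nn.functional.elu(x) + 1.0
    q, k = phi(q), phi(k)
    kv = k.transpose(0, 1) @ v                            # (d, d_v): aggregated key-value summary
    z = q @ k.sum(dim=0, keepdim=True).transpose(0, 1)    # (n, 1): per-query normalization
    return (q @ kv) / (z + eps)                           # (n, d_v)

# Usage: attention over n = 1024 tokens embedded in d = 64 dimensions.
n, d = 1024, 64
out = kernel_attention(torch.randn(n, d), torch.randn(n, d), torch.randn(n, d))
print(out.shape)  # torch.Size([1024, 64])
```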


Related research

03/10/2023  Exphormer: Sparse Transformers for Graphs
Graph transformers have emerged as a promising architecture for a variet...

06/19/2023  P-tensors: a General Formalism for Constructing Higher Order Message Passing Networks
Several recent papers have shown that higher order graph neural...

06/01/2022  Higher-Order Attention Networks
This paper introduces higher-order attention networks (HOANs), a novel c...

05/22/2023  Neural Functional Transformers
The recent success of neural networks as implicit representation of data...

02/23/2018  The Weighted Kendall and High-order Kernels for Permutations
We propose new positive definite kernels for permutations. First we intr...

07/06/2022  Pure Transformers are Powerful Graph Learners
We show that standard Transformers without graph-specific modifications ...

04/08/2020  The general theory of permutation equivariant neural networks and higher order graph variational encoders
Previous work on symmetric group equivariant neural networks generally o...
