
Systematic Generalization with Edge Transformers

by Leon Bergen et al.

Recent research suggests that systematic generalization in natural language understanding remains a challenge for state-of-the-art neural models such as Transformers and Graph Neural Networks. To tackle this challenge, we propose the Edge Transformer, a new model that combines inspiration from Transformers and rule-based symbolic AI. The first key idea in Edge Transformers is to associate vector states with every edge, that is, with every pair of input nodes, as opposed to just every node, as is done in the Transformer model. The second major innovation is a triangular attention mechanism that updates edge representations in a way inspired by unification from logic programming. We evaluate the Edge Transformer on compositional generalization benchmarks in relational reasoning, semantic parsing, and dependency parsing. In all three settings, the Edge Transformer outperforms Relation-aware, Universal, and classical Transformer baselines.
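The triangular attention idea can be illustrated with a minimal single-head sketch, assuming a plausible parameterization (the projection matrices `Wq`, `Wk`, `Wv1`, `Wv2`, `Wo` and the elementwise combination of values are illustrative assumptions, not the paper's exact formulation): each edge (i, j) attends over intermediate nodes l, scoring its query for edge (i, l) against the key for edge (l, j), so information flows around the triangle (i, l, j), loosely analogous to chaining two relations in a logic rule.

```python
import numpy as np

def triangular_attention(E, Wq, Wk, Wv1, Wv2, Wo):
    """One hypothetical triangular-attention head over edge states.

    E: (n, n, d) array, E[i, j] is the vector state of edge (i, j).
    Wq, Wk, Wv1, Wv2, Wo: (d, d) projection matrices (assumed shapes).
    Returns updated edge states of shape (n, n, d).
    """
    n, _, d = E.shape
    Q = E @ Wq    # queries for edges (i, l)
    K = E @ Wk    # keys for edges (l, j)
    V1 = E @ Wv1  # values from the first leg (i, l)
    V2 = E @ Wv2  # values from the second leg (l, j)

    # scores[i, j, l] = <Q[i, l], K[l, j]> / sqrt(d): edge (i, j)
    # scores each intermediate node l via the triangle (i, l, j).
    scores = np.einsum('ild,ljd->ijl', Q, K) / np.sqrt(d)

    # Numerically stable softmax over the intermediate node l.
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A = A / A.sum(axis=-1, keepdims=True)

    # Combine the two legs' values (here: elementwise product) and
    # aggregate over l with the attention weights, then project out.
    out = np.einsum('ijl,ild,ljd->ijd', A, V1, V2) @ Wo
    return out
```

Note the cost: attention is computed over all n^3 triangles, so this layer is cubic in the number of nodes, versus quadratic for standard node-level self-attention.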



Related research:

Relational Attention: Generalizing Transformers for Graph-Structured Tasks
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
Making Transformers Solve Compositional Tasks
Characterizing Intrinsic Compositionality in Transformers with Tree Projections
Transformers generalize differently from information stored in context vs in weights
Diagnosing Transformers in Task-Oriented Semantic Parsing