Composition, Attention, or Both?

10/24/2022
by Ryo Yoshida et al.

In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively compose subtrees into a single vector representation with a composition function, and selectively attend to previous structural information with a self-attention mechanism. We investigate whether these components – the composition function and the self-attention mechanism – can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components, with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role in making LMs more human-like, and closer inspection of linguistic phenomena suggested that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.
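To make the two components concrete, below is a minimal sketch (not the authors' implementation) of what a composition function and a structural self-attention mechanism might look like in PyTorch. The bidirectional-LSTM composition, the module names, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' implementation) of the two components the
# abstract contrasts: a composition function that folds a completed subtree's
# children into one vector, and a self-attention mechanism over previously
# built structural representations. Module names and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn as nn


class Composition(nn.Module):
    """Compose the child vectors of a closed constituent into one subtree vector."""

    def __init__(self, dim: int):
        super().__init__()
        # Bidirectional LSTM over the children (an RNNG-style assumption).
        self.birnn = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, children: torch.Tensor) -> torch.Tensor:
        # children: (num_children, dim) -> (dim,) subtree representation
        out, _ = self.birnn(children.unsqueeze(0))        # (1, n, 2*dim)
        fwd_last = out[0, -1, : out.size(-1) // 2]        # forward final state
        bwd_last = out[0, 0, out.size(-1) // 2 :]         # backward final state
        return torch.tanh(self.proj(torch.cat([fwd_last, bwd_last])))


class StructuralSelfAttention(nn.Module):
    """Let the current parser state attend over all earlier structural symbols."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (seq_len, dim); the newest position queries the full history.
        h = history.unsqueeze(0)                          # (1, seq, dim)
        query = h[:, -1:, :]                              # current state
        out, _ = self.attn(query, h, h)                   # (1, 1, dim)
        return out[0, 0]


if __name__ == "__main__":
    dim = 16
    compose = Composition(dim)
    attend = StructuralSelfAttention(dim)

    children = torch.randn(3, dim)        # e.g. vectors for "the", "old", "man"
    subtree = compose(children)           # one vector for the closed constituent
    history = torch.cat([torch.randn(5, dim), subtree.unsqueeze(0)])
    context = attend(history)             # attends over prior structure + new subtree
    print(subtree.shape, context.shape)   # torch.Size([16]) torch.Size([16])
```

In this reading, removing `Composition` or `StructuralSelfAttention` corresponds, in spirit, to the ablations the paper uses to test whether each component contributes to human-like syntactic generalization.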

