Relational Weight Priors in Neural Networks for Abstract Pattern Learning and Language Modelling

03/10/2021
by Radha Kopparti, et al.

Deep neural networks have become the dominant approach in natural language processing (NLP). However, in recent years it has become apparent that there are shortcomings in systematicity that limit the performance and data efficiency of deep learning in NLP. These shortcomings can be clearly shown in lower-level artificial tasks, mostly on synthetic data. Abstract patterns are the best-known examples of a hard problem for neural networks in terms of generalisation to unseen data. They are defined by relations between items, such as equality, rather than by the values of the items themselves. It has been argued that these low-level problems demonstrate the inability of neural networks to learn systematically. In this study, we propose Embedded Relation Based Patterns (ERBP) as a novel way to create a relational inductive bias that encourages learning equality and distance-based relations for abstract patterns. ERBP is based on Relation Based Patterns (RBP), but modelled as a Bayesian prior on network weights and implemented as a regularisation term in otherwise standard network learning. ERBP is easy to integrate into standard neural networks and does not affect their learning capacity. In our experiments, ERBP priors lead to almost perfect generalisation when learning abstract patterns from synthetic noise-free sequences. ERBP also improves natural language models at the word and character level, as well as pitch prediction in melodies, with RNN, GRU and LSTM networks. We also find improvements in the more complex tasks of learning graph edit distance and compositional sentence entailment. ERBP consistently improves over RBP and over standard networks, showing that it enables abstract pattern learning, which contributes to performance in natural language tasks.
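The abstract does not spell out the regulariser, but the core idea of a Gaussian weight prior implemented as a penalty term can be sketched. The following is a minimal illustration, not the paper's implementation: it assumes the prior is centred on a target weight matrix that wires each pair of corresponding embedding dimensions into a difference unit with weights (+1, -1), supporting equality and distance comparisons. The names `erbp_penalty`, `target`, and the hyper-parameter `lam` are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

d = 8                      # assumed embedding dimension per item
hidden = d                 # one comparison unit per dimension pair
layer = nn.Linear(2 * d, hidden)

# Target weights: unit i computes x_i - y_i, i.e. +1 on x_i and -1 on y_i
# of the concatenated pair (x, y). This encodes the relational bias.
target = torch.zeros(hidden, 2 * d)
for i in range(hidden):
    target[i, i] = 1.0
    target[i, d + i] = -1.0

def erbp_penalty(weight, target, lam=1e-2):
    """Negative log of a Gaussian prior on the weights, up to a constant:
    an L2 pull of the comparison-layer weights toward the target pattern."""
    return lam * torch.sum((weight - target) ** 2)

# One training step: the standard task loss plus the prior-induced term,
# leaving the rest of the network and its learning procedure unchanged.
x = torch.randn(32, 2 * d)                      # batch of item pairs
labels = torch.randint(0, 2, (32,)).float()     # toy equality labels
logits = torch.sigmoid(layer(x)).mean(dim=1)    # toy readout
loss = nn.functional.binary_cross_entropy(logits, labels)
loss = loss + erbp_penalty(layer.weight, target)
loss.backward()
```

Because the prior only adds a differentiable penalty on existing weights, it leaves the network's capacity untouched: with `lam` set to zero, training reduces exactly to the standard model.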


