Neural Networks and the Chomsky Hierarchy

07/05/2022
by Grégoire Delétang et al.

Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (2200 models, 16 tasks) to investigate whether insights from the theory of computation can predict the limits of neural network generalization in practice. We demonstrate that grouping tasks according to the Chomsky hierarchy allows us to forecast whether certain architectures will be able to generalize to out-of-distribution inputs. This includes negative results where even extensive amounts of data and training time never led to any non-trivial generalization, despite models having sufficient capacity to perfectly fit the training data. Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, LSTMs can solve regular and counter-language tasks, and only networks augmented with structured memory (such as a stack or memory tape) can successfully generalize on context-free and context-sensitive tasks.
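
The evaluation protocol behind these claims is length generalization: train on short inputs, then test on strictly longer ones. Below is a minimal Python sketch of that setup, not the authors' released code; the three task generators stand in for tasks at different levels of the hierarchy, and the specific length ranges are illustrative assumptions rather than the paper's exact values.

    import random

    def parity(bits):
        # Regular task: one bit of internal state suffices.
        return [sum(bits) % 2]

    def reverse_string(bits):
        # Deterministic context-free task: solvable with a stack.
        return bits[::-1]

    def duplicate_string(bits):
        # Context-sensitive task: needs random access / a memory tape.
        return bits + bits

    TASKS = {
        "parity": parity,
        "reverse": reverse_string,
        "duplicate": duplicate_string,
    }

    def sample(task, min_len, max_len):
        """Draw one (input, target) pair with length in [min_len, max_len]."""
        n = random.randint(min_len, max_len)
        x = [random.randint(0, 1) for _ in range(n)]
        return x, TASKS[task](x)

    # Train in-distribution on short strings, evaluate out-of-distribution
    # on strictly longer ones (ranges here are assumptions).
    train_pair = sample("reverse", 1, 40)
    ood_pair = sample("reverse", 41, 500)
    print("train:", train_pair)
    print("ood:  ", ood_pair)

Under this protocol, an architecture "generalizes" on a task only if it remains accurate on the longer, out-of-distribution inputs; that is the criterion behind the per-architecture claims above.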

Related research

Context-Free Transductions with Neural Stacks (09/08/2018)
This paper analyzes the behavior of stack-augmented recurrent neural net...

Out-of-Distribution Generalization in Algorithmic Reasoning Through Curriculum Learning (10/07/2022)
Out-of-distribution generalization (OODG) is a longstanding challenge fo...

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions (11/22/2022)
Despite the widespread success of Transformers on NLP tasks, recent work...

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization (07/20/2023)
Despite extensive studies, the underlying reason as to why overparameter...

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization (07/27/2021)
The successes of deep learning critically rely on the ability of neural ...

Understanding Memory Modules on Learning Simple Algorithms (07/01/2019)
Recent work has shown that memory modules are crucial for the generaliza...

A Too-Good-to-be-True Prior to Reduce Shortcut Reliance (02/12/2021)
Despite their impressive performance in object recognition and other tas...
