Out-of-Distribution Generalization in Algorithmic Reasoning Through Curriculum Learning

10/07/2022
by Andrew J. Nam, et al.

Out-of-distribution generalization (OODG) is a longstanding challenge for neural networks, and is quite apparent in tasks with well-defined variables and rules, where explicit use of the rules can solve problems independently of the particular values of the variables. Large transformer-based language models have pushed the boundaries on how well neural networks can generalize to novel inputs, but their complexity obfuscates how they achieve such robustness. As a step toward understanding how transformer-based systems generalize, we explore the question of OODG in smaller-scale transformers. Using a reasoning task based on the puzzle Sudoku, we show that OODG can occur on complex problems if the training set includes examples sampled from the whole distribution of simpler component tasks.

research
05/22/2023

Teaching Probabilistic Logical Reasoning to Transformers

Recent research on transformer-based language models investigates their ...
research
07/27/2021

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

The successes of deep learning critically rely on the ability of neural ...
research
07/05/2022

Neural Networks and the Chomsky Hierarchy

Reliable generalization lies at the heart of safe ML and AI. However, un...
research
02/08/2020

Exploring the Memorization-Generalization Continuum in Deep Learning

Human learners appreciate that some facts demand memorization whereas ot...
research
05/21/2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

We propose a novel prompting strategy, least-to-most prompting, that ena...
research
01/30/2023

Generalization on the Unseen, Logic Reasoning and Degree Curriculum

This paper considers the learning of logical (Boolean) functions with fo...
research
12/16/2021

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Investigating the reasoning abilities of transformer models, and discove...
