CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text

08/16/2019
by Koustuv Sinha, et al.

The recent success of natural language understanding (NLU) systems has been troubled by results highlighting the failure of these models to generalize in a systematic and robust way. In this work, we introduce a diagnostic benchmark suite, named CLUTRR, to clarify some key issues related to the robustness and systematicity of NLU systems. Motivated by classic work on inductive logic programming, CLUTRR requires that an NLU system infer kinship relations between characters in short stories. Successful performance on this task requires both extracting relationships between entities and inferring the logical rules governing these relationships. CLUTRR allows us to precisely measure a model's capacity for systematic generalization by evaluating on held-out combinations of logical rules, and it allows us to evaluate a model's robustness by adding curated noise facts. Our empirical results highlight a substantial performance gap between state-of-the-art NLU models (e.g., BERT and MAC) and a graph neural network model that works directly with symbolic inputs, with the graph-based model exhibiting both stronger generalization and greater robustness.
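To make the task concrete, the following is a minimal symbolic sketch of the kind of reasoning the abstract describes: facts extracted from a story are chained, and kinship rules are composed to answer a query about two characters who are never directly linked. The rule table, the character names, and the helper functions `find_path` and `infer_relation` are all illustrative assumptions; they are not taken from the CLUTRR dataset or its code, and the benchmark itself asks models to work from text rather than from pre-extracted facts.

```python
from collections import deque

# Hypothetical composition rules (not the CLUTRR rule set):
# (r1, r2) -> r3 reads "my r1's r2 is my r3".
RULES = {
    ("brother", "mother"): "mother",
    ("sister", "mother"): "mother",
    ("mother", "sister"): "aunt",
    ("mother", "brother"): "uncle",
    ("mother", "mother"): "grandmother",
    ("father", "father"): "grandfather",
}


def find_path(facts, source, target):
    """Breadth-first search for a chain of relations from source to target.

    Each fact is (a, relation, b), read as "a's <relation> is b".
    Returns the list of relations along the path, or None if unreachable.
    """
    graph = {}
    for a, rel, b in facts:
        graph.setdefault(a, []).append((rel, b))
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, rels = queue.popleft()
        if node == target:
            return rels
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, rels + [rel]))
    return None


def infer_relation(facts, source, target):
    """Compose the relations on the path left to right using RULES."""
    rels = find_path(facts, source, target)
    if not rels:
        return None
    result = rels[0]
    for rel in rels[1:]:
        result = RULES.get((result, rel))
        if result is None:  # no rule covers this combination
            return None
    return result


# "Alice's brother is Bob. Bob's mother is Carol. Carol's sister is Dana."
story = [
    ("Alice", "brother", "Bob"),
    ("Bob", "mother", "Carol"),
    ("Carol", "sister", "Dana"),
]
# A noise fact that is irrelevant to the query about Alice and Dana.
noisy_story = story + [("Bob", "father", "Evan")]

answer = infer_relation(story, "Alice", "Dana")
noisy_answer = infer_relation(noisy_story, "Alice", "Dana")
print(answer, noisy_answer)
```

In this sketch the query requires composing two rules that never appear together as a single stated fact, which loosely mirrors evaluation on held-out rule combinations, and the extra fact about Evan plays the role of a noise fact that a robust reasoner should ignore.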

research
03/14/2020

Evaluating Logical Generalization in Graph Neural Networks

Recent research has highlighted the role of relational inductive biases ...
research
06/17/2019

Neural Theorem Provers Do Not Learn Rules Without Exploration

Neural symbolic processing aims to combine the generalization of logical...
research
09/30/2020

Measuring Systematic Generalization in Neural Proof Generation with Transformers

We are interested in understanding how well Transformer language models ...
research
03/20/2022

Differentiable Reasoning over Long Stories – Assessing Systematic Generalisation in Neural Models

Contemporary neural networks have achieved a series of developments and ...
research
11/28/2021

ORCHARD: A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

The ability to reason with multiple hierarchical structures is an attrac...
research
07/09/2018

AI Reasoning Systems: PAC and Applied Methods

Learning and logic are distinct and remarkable approaches to prediction....
research
10/12/2022

CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations

Well-designed diagnostic tasks have played a key role in studying the fa...
