Testing the Generalization Power of Neural Network Models Across NLI Benchmarks

10/23/2018
by Aarne Talman, et al.

Neural network models have been very successful for natural language inference, with the best models reaching 90% accuracy in some benchmarks. However, the success of these models turns out to be largely benchmark specific. We show that models trained on a natural language inference dataset drawn from one benchmark fail to perform well on others, even if the notion of inference assumed in these benchmark tasks is the same or similar. We train five state-of-the-art neural network models on different datasets and show that each of them fails to generalize outside its respective benchmark. In light of these results, we conclude that current neural network models are not able to generalize in capturing the semantics of natural language inference, but instead seem to overfit to the specific dataset.
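The evaluation protocol described in the abstract (train on one benchmark, test on every benchmark) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the benchmark names `bench_a` and `bench_b`, the toy sentence pairs, and the majority-label baseline, which stands in for a neural model and is not one of the models used in the paper.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str, str]      # (premise, hypothesis, label)
Model = Callable[[str, str], str]   # maps (premise, hypothesis) to a label


def train_majority_baseline(train: List[Example]) -> Model:
    """Hypothetical stand-in for a neural model: always predicts
    the most frequent label seen in the training split."""
    majority = Counter(label for _, _, label in train).most_common(1)[0][0]
    return lambda premise, hypothesis: majority


def accuracy(model: Model, test: List[Example]) -> float:
    """Fraction of test pairs whose predicted label matches the gold label."""
    correct = sum(model(p, h) == y for p, h, y in test)
    return correct / len(test)


def cross_benchmark_accuracy(
    train_fn: Callable[[List[Example]], Model],
    benchmarks: Dict[str, Dict[str, List[Example]]],
) -> Dict[Tuple[str, str], float]:
    """Train one model per benchmark and score it on every test split.
    Keys are (train_benchmark, test_benchmark)."""
    results = {}
    for train_name, splits in benchmarks.items():
        model = train_fn(splits["train"])
        for test_name, test_splits in benchmarks.items():
            results[(train_name, test_name)] = accuracy(model, test_splits["test"])
    return results


# Toy, invented data: two benchmarks with different label distributions.
benchmarks = {
    "bench_a": {
        "train": [
            ("A man sleeps.", "A person rests.", "entailment"),
            ("A dog runs.", "An animal moves.", "entailment"),
            ("A cat sits.", "A cat flies.", "contradiction"),
        ],
        "test": [
            ("A girl sings.", "A child makes sound.", "entailment"),
            ("A boy eats.", "Someone eats.", "entailment"),
        ],
    },
    "bench_b": {
        "train": [
            ("It is raining.", "It is dry.", "contradiction"),
            ("The door is open.", "The door is shut.", "contradiction"),
            ("A woman reads.", "She reads a novel.", "neutral"),
        ],
        "test": [
            ("The light is on.", "The light is off.", "contradiction"),
            ("A man walks.", "He walks to work.", "neutral"),
        ],
    },
}

results = cross_benchmark_accuracy(train_majority_baseline, benchmarks)
for (train_name, test_name), acc in sorted(results.items()):
    print(f"train={train_name} test={test_name} accuracy={acc:.2f}")
```

The diagonal entries (same benchmark for training and testing) correspond to the usual in-benchmark scores, while the off-diagonal entries measure the cross-benchmark generalization that the abstract reports as failing.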


