Investigating Biases in Textual Entailment Datasets

06/23/2019
by   Shawn Tan, et al.

The ability to understand logical relationships between sentences is an important task in language understanding. To aid progress on this task, researchers have collected datasets for machine learning and for evaluating current systems. However, as in the crowdsourced Visual Question Answering (VQA) task, some biases inevitably occur in the data. In our experiments, we find that performing classification on just the hypotheses of the SNLI dataset yields an accuracy of 64%. We find similar biases in the MultiNLI dataset, discuss their implications, and propose a simple method to reduce the biases in the datasets.
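The hypothesis-only probe described above can be illustrated with a minimal sketch: train a classifier that never sees the premise and check whether it still beats chance. The toy examples, cue words, and the from-scratch naive Bayes below are illustrative assumptions, not the paper's actual model or data.

```python
from collections import Counter, defaultdict
import math

# Toy SNLI-style (hypothesis, label) pairs. Labels and cue words (e.g. "not"
# marking contradictions) are invented to mimic annotation artifacts.
TRAIN = [
    ("a man is outside", "entailment"),
    ("a person is outdoors", "entailment"),
    ("people are near water", "entailment"),
    ("a woman is sleeping", "contradiction"),
    ("nobody is eating", "contradiction"),
    ("the dog is not running", "contradiction"),
    ("some people are rich", "neutral"),
    ("the man is tall", "neutral"),
]

def train_nb(examples):
    """Multinomial naive Bayes over hypothesis tokens only (premise unseen)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for w in text.split():
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def predict(model, hypothesis):
    """Return the label with the highest log-probability for the hypothesis."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for label in label_counts:
        lp = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)  # Laplace smoothing
        for w in hypothesis.split():
            lp += math.log((word_counts[label][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train_nb(TRAIN)
# The negation cue alone is enough to push this hypothesis toward "contradiction".
print(predict(model, "the woman is not walking"))
```

If such a premise-blind classifier scores well above the 33% chance level for three balanced classes, the hypotheses themselves leak label information, which is the bias the abstract reports.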
