Investigating Biases in Textual Entailment Datasets
The ability to understand logical relationships between sentences is an important component of language understanding. To drive progress on this task, researchers have collected datasets for training machine learning systems and for evaluating current ones. However, as in the crowdsourced Visual Question Answering (VQA) task, some biases inevitably occur in the data. In our experiments, we find that performing classification on just the hypotheses of the SNLI dataset yields an accuracy of 64%. We carry out a similar analysis on the MultiNLI dataset, discuss its implications, and propose a simple method to reduce the biases in the datasets.
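The hypothesis-only baseline described above can be illustrated with a minimal sketch. The abstract does not specify the classifier used, so the toy examples, the unigram naive Bayes model, and all function names below are illustrative assumptions, not the paper's method; the essential point is simply that the premise is discarded and the label is predicted from the hypothesis alone.

```python
from collections import Counter, defaultdict
import math

# Toy NLI examples as (premise, hypothesis, label) triples.
# These examples are invented for illustration, not from SNLI.
TRAIN = [
    ("A man plays guitar.", "Nobody is playing music.", "contradiction"),
    ("A dog runs in a park.", "No animal is outside.", "contradiction"),
    ("Kids are on a beach.", "Children are near the ocean.", "entailment"),
    ("A woman reads a book.", "A person is reading.", "entailment"),
    ("A man is cooking.", "A man is winning a competition.", "neutral"),
    ("Two people talk.", "They are discussing a competition.", "neutral"),
]

def train_hypothesis_only(data):
    """Fit per-label unigram counts using ONLY the hypothesis text.

    The premise is deliberately ignored -- that is the point of the
    baseline: any accuracy above chance reflects bias in how the
    hypotheses were written, not entailment reasoning.
    """
    counts = defaultdict(Counter)
    labels = Counter()
    for _premise, hypothesis, label in data:
        labels[label] += 1
        counts[label].update(hypothesis.lower().split())
    return counts, labels

def predict(counts, labels, hypothesis):
    """Naive Bayes score with add-one smoothing; premise never seen."""
    words = hypothesis.lower().split()
    vocab = {w for c in counts.values() for w in c}
    best, best_score = None, float("-inf")
    for label, label_count in labels.items():
        total = sum(counts[label].values())
        score = math.log(label_count / sum(labels.values()))
        for w in words:
            score += math.log((counts[label][w] + 1) / (total + len(vocab) + 1))
        if score > best_score:
            best, best_score = label, score
    return best

counts, labels = train_hypothesis_only(TRAIN)
# Cue words like "nobody"/"no" can leak the label in crowdsourced data.
print(predict(counts, labels, "Nobody is outside."))  # -> contradiction
```

A model like this picking up label-specific cue words (negations for contradiction, vague hypernyms for entailment) is one way hypothesis-only classification can beat chance without ever reading the premise.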