Evaluating Compositionality in Sentence Embeddings

02/12/2018
by   Ishita Dasgupta, et al.
0

An important frontier in the quest for human-like AI is compositional semantics: how do we design systems that understand an infinite number of expressions built from a finite vocabulary? Recent research has attempted to solve this problem by using deep neural networks to learn vector space embeddings of sentences, which then serve as input to supervised learning problems like paraphrase detection and sentiment analysis. Here we focus on 'natural language inference' (NLI) as a critical test of a system's capacity for semantic compositionality. In the NLI task, sentence pairs are assigned one of three categories: entailment, contradiction, or neutral. We present a new set of NLI sentence pairs that cannot be solved using only word-level knowledge and instead require some degree of compositionality. We use state of the art sentence embeddings trained on NLI (InferSent, Conneau et al. (2017)), and find that performance on our new dataset is poor, indicating that the representations learned by this model fail to capture the needed compositionality. We analyze some of the decision rules learned by InferSent and find that they are largely driven by simple heuristics at the word level that are ecologically valid in the SNLI dataset on which InferSent is trained. Further, we find that augmenting the training dataset with our new dataset improves performance on a held-out test set without loss of performance on the SNLI test set. This highlights the importance of structured datasets in better understanding, as well as improving the performance of, AI systems.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset