ChatGPT and Simple Linguistic Inferences: Blind Spots and Blinds

05/24/2023
by Victoria Basmov, et al.

This paper sheds light on the limitations of ChatGPT's understanding capabilities, focusing on simple inference tasks that are typically easy for humans but appear to be challenging for the model. Specifically, we target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments. We present expert-designed evaluation sets for these inference types and conduct experiments in a zero-shot setup. Our results show that the model struggles with these types of inferences, exhibiting moderate to low accuracy. Moreover, while ChatGPT demonstrates knowledge of the underlying linguistic concepts when prompted directly, it often fails to incorporate this knowledge to make correct inferences. Even more strikingly, further experiments show that embedding the premise under presupposition triggers or non-factive verbs causes the model to predict entailment more frequently, regardless of the correct semantic label. Overall, these results suggest that, despite ChatGPT's celebrated language understanding capacity, the model has blind spots with respect to certain types of entailment, and that certain entailment-cancelling features act as “blinds” overshadowing the semantics of the embedded premise. Our analyses emphasize the need for further research into the linguistic comprehension and reasoning capabilities of LLMs, in order to improve their reliability and establish their trustworthiness for real-world applications.
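As a rough illustration of the zero-shot setup the abstract describes, the sketch below shows how a single premise-hypothesis pair might be posed to a chat model as a three-way entailment query. The prompt wording, the example sentences, and the `query_model` helper are illustrative assumptions, not the authors' exact evaluation protocol.

```python
# Minimal sketch of a zero-shot NLI query to a chat model.
# The prompt wording, example sentences, and query_model() helper are
# illustrative assumptions; they are not the paper's exact protocol.

LABELS = ("entailment", "contradiction", "neutral")

def build_prompt(premise: str, hypothesis: str) -> str:
    """Format a premise-hypothesis pair as a single zero-shot instruction."""
    return (
        "Given the premise, decide whether the hypothesis is an entailment, "
        "a contradiction, or neutral. Answer with one word.\n\n"
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n"
        "Answer:"
    )

def parse_label(response: str) -> str:
    """Map the model's free-text answer onto one of the three NLI labels."""
    text = response.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return "unparsed"

# Example of the kind of item the paper targets: an evidential adverb of
# uncertainty ("supposedly") should block the entailment, so the gold
# label here would be "neutral" rather than "entailment".
premise = "Supposedly, the package arrived on Monday."
hypothesis = "The package arrived on Monday."

prompt = build_prompt(premise, hypothesis)
# response = query_model(prompt)   # hypothetical call to a chat-model API
# print(parse_label(response))
```

A full evaluation would loop such queries over an expert-designed set of premise-hypothesis pairs and compare the parsed predictions against the gold semantic labels.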


