ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models

03/29/2023
by Ning Bian, et al.

Large language models (LLMs) such as ChatGPT and GPT-4 have made significant progress in NLP. However, their ability to memorize, represent, and leverage commonsense knowledge remains a well-known pain point. It is still unclear: (1) Can ChatGPT effectively answer commonsense questions? (2) Is ChatGPT knowledgeable in commonsense? (3) Is ChatGPT aware of the underlying commonsense knowledge needed to answer a specific question? (4) Can ChatGPT effectively leverage commonsense knowledge when answering questions? To investigate these questions, we conduct a series of experiments evaluating ChatGPT's commonsense abilities. The results show that: (1) ChatGPT achieves good QA accuracy on commonsense tasks, but still struggles with certain types of knowledge. (2) ChatGPT is knowledgeable and can accurately generate most of the required commonsense knowledge given knowledge prompts. (3) Despite this knowledge, ChatGPT is an inexperienced commonsense problem solver: it cannot precisely identify which commonsense knowledge is required to answer a specific question. These findings call for better mechanisms for utilizing commonsense knowledge in LLMs, such as improved instruction following and better commonsense guidance.
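To make the probing setup concrete, below is a minimal illustrative sketch (not the paper's released code) of how one might query a chat model on a commonsense multiple-choice question in two ways: asking it to answer directly, and asking it to state the commonsense knowledge it believes the question requires. It assumes the pre-1.0 OpenAI Python SDK; the example question and prompts are made up for illustration.

```python
# Illustrative sketch only: probe a chat model for (1) direct commonsense QA
# and (3) awareness of the commonsense knowledge a question depends on.
# Assumes the pre-1.0 OpenAI Python SDK; question and prompts are hypothetical.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def ask(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Send a single-turn prompt and return the model's reply text."""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep outputs as deterministic as possible for evaluation
    )
    return response["choices"][0]["message"]["content"].strip()

question = (
    "Where would you put a dirty dish right after dinner?\n"
    "(A) cupboard (B) dishwasher (C) table (D) refrigerator\n"
    "Answer with a single option letter."
)

# Direct QA: can the model pick the correct option?
answer = ask(question)

# Knowledge awareness: can the model state which commonsense facts
# the question depends on, without answering it?
knowledge_probe = ask(
    "List the commonsense facts needed to answer the following question, "
    "without answering it:\n" + question
)

print("Answer:", answer)
print("Knowledge stated by the model:", knowledge_probe)
```

Comparing the stated knowledge against the knowledge the question actually requires is one way to separate "being knowledgeable" from "knowing which knowledge to apply", which is the distinction the findings above draw.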


