Why Does ChatGPT Fall Short in Answering Questions Faithfully?

04/20/2023
by Shen Zheng, et al.

Recent advancements in Large Language Models, such as ChatGPT, have demonstrated significant potential to impact various aspects of human life. However, ChatGPT still faces challenges in areas such as faithfulness. Taking question answering as a representative application, we seek to understand why ChatGPT falls short in answering questions faithfully. To address this question, we analyze the failures of ChatGPT in complex open-domain question answering and identify the abilities underlying these failures. Specifically, we categorize ChatGPT's failures into four types: comprehension, factualness, specificity, and inference. We further pinpoint three critical abilities associated with QA failures: knowledge memorization, knowledge association, and knowledge reasoning. Additionally, we conduct experiments centered on these abilities and propose potential approaches to enhance faithfulness. The results indicate that furnishing the model with fine-grained external knowledge, hints for knowledge association, and guidance for reasoning can help the model answer questions more faithfully.

research
06/23/2023

ToolQA: A Dataset for LLM Question Answering with External Tools

Large Language Models (LLMs) have demonstrated impressive performance in...
research
11/21/2019

Temporal Reasoning via Audio Question Answering

Multimodal question answering tasks can be used as proxy tasks to study ...
research
11/27/2019

JEC-QA: A Legal-Domain Question Answering Dataset

We present JEC-QA, the largest question answering dataset in the legal d...
research
05/24/2023

Mixture of Prompt Experts for Generalizable and Interpretable Question Answering

One of the ultimate quests of question answering (QA) is to deploy a sys...
research
04/19/2017

Answering Complex Questions Using Open Information Extraction

While there has been substantial progress in factoid question-answering ...
research
06/01/2018

A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset

The recent work of Clark et al. introduces the AI2 Reasoning Challenge (...
research
04/07/2023

Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering

Causal Video Question Answering (CVidQA) queries not only association or...
