Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue

12/12/2022
by   Sunjae Yoon, et al.

Video-grounded Dialogue (VGD) aims to decode an answer sentence to a question regarding a given video and dialogue context. Despite the recent success of multi-modal reasoning in generating answer sentences, existing dialogue systems still suffer from a text hallucination problem: indiscriminate copying of text from the inputs without an understanding of the question. The problem stems from a spurious correlation in the data. Answer sentences in the dataset usually include words from the input texts, so the VGD system learns to rely excessively on copying input words in the hope that they overlap with the ground-truth text. Hence, we design the Text Hallucination Mitigating (THAM) framework, which incorporates a Text Hallucination Regularization (THR) loss derived from the proposed information-theoretic text hallucination measurement approach. Applying THAM to current dialogue systems validates its effectiveness on VGD benchmarks (i.e., AVSD@DSTC7 and AVSD@DSTC8) and shows enhanced interpretability.
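To make the copying behavior concrete, the sketch below computes a simple token-overlap ratio between an answer and the input text. This is only an illustrative proxy and not the information-theoretic measurement or the THR loss proposed in the paper. The function name `copy_ratio`, the whitespace tokenization, and the example sentences are all assumptions introduced here.

```python
from collections import Counter


def copy_ratio(answer: str, input_text: str) -> float:
    """Fraction of answer tokens that can be matched to tokens in the input text.

    Hypothetical helper for illustration only: lowercased whitespace tokens,
    each input token may be matched at most once.
    """
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return 0.0
    available = Counter(input_text.lower().split())
    copied = 0
    for token in answer_tokens:
        if available[token] > 0:
            available[token] -= 1  # consume one occurrence from the input
            copied += 1
    return copied / len(answer_tokens)


# Made-up example: an answer assembled entirely from input words.
context = "is a man walking in the room"
print(copy_ratio("a man is walking", context))
print(copy_ratio("he sits down", context))
```

A ratio near 1 means the answer could have been assembled purely from input words. A high ratio alone does not show hallucination, since correct answers often reuse input words, which is exactly the correlation the abstract describes as misleading.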

research
03/01/2021

Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues

Compared to traditional visual question answering, video-grounded dialog...
research
05/30/2023

VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions

Video-grounded dialogue understanding is a challenging problem that requ...
research
03/24/2021

Structured Co-reference Graph Attention for Video-grounded Dialogue

A video-grounded dialogue system referred to as the Structured Co-refere...
research
05/17/2023

IMAD: IMage-Augmented multi-modal Dialogue

Currently, dialogue systems have achieved high performance in processing...
research
10/07/2015

Using Ontology-Based Context in the Portuguese-English Translation of Homographs in Textual Dialogues

This paper introduces a novel approach to tackle the existing gap on mes...
research
01/30/2023

Using n-aksaras to model Sanskrit and Sanskrit-adjacent texts

Despite – or perhaps because of – their simplicity, n-grams, or contiguo...
research
12/29/2014

Quantifying origin and character of long-range correlations in narrative texts

In natural language using short sentences is considered efficient for co...
