DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

07/07/2020
by Xiaoze Jiang, et al.

The Visual Dialogue task requires an agent to engage in a conversation with a human about an image. The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation. In this paper, we propose a novel generative decoding architecture to generate high-quality responses, which moves away from decoding the whole encoded semantics towards a design that favors both transparency and flexibility. In this architecture, word generation is decomposed into a series of attention-based information selection steps, performed by the novel recurrent Deliberation, Abandon and Memory (DAM) module. Each DAM module performs an adaptive combination of the response-level semantics captured from the encoder and the word-level semantics specifically selected for generating each word. As a result, the responses contain more detailed and non-repetitive descriptions while maintaining semantic accuracy. Furthermore, DAM can be combined with existing visual dialogue encoders and adapts to their structures by constraining its information selection mode. We apply DAM to three typical encoders and verify the performance on the VisDial v1.0 dataset. Experimental results show that the proposed models achieve new state-of-the-art performance with high-quality responses. The code is available at https://github.com/JXZe/DAM.
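
The idea of combining response-level and word-level semantics at each decoding step can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the attention form or the combination rule, so the code assumes plain dot-product attention and a scalar sigmoid gate, and every function and variable name here (`attend`, `adaptive_combine`, `gate_weights`) is hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, items):
    """Assumed dot-product attention: select word-level information
    from encoder items, weighted by similarity to the query."""
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    return [sum(w * item[d] for w, item in zip(weights, items))
            for d in range(dim)]


def adaptive_combine(response_level, word_level, gate_weights, gate_bias=0.0):
    """Assumed scalar gate: a convex mix of the two semantic vectors.
    The gate is computed from the concatenation of both inputs."""
    z = dot(gate_weights, response_level + word_level) + gate_bias
    g = 1.0 / (1.0 + math.exp(-z))
    return [g * r + (1.0 - g) * w for r, w in zip(response_level, word_level)]


# Toy inputs (illustrative values only, not taken from the paper).
encoder_items = [[1.0, 0.0], [0.0, 1.0]]   # e.g. two encoded facts
decoder_state = [2.0, 0.0]                 # query for the current word
response_level = [0.5, 0.5]                # global summary from the encoder

word_level = attend(decoder_state, encoder_items)
fused = adaptive_combine(response_level, word_level, gate_weights=[0.0] * 4)
```

With all gate weights set to zero the gate evaluates to 0.5, so `fused` is the plain average of the two vectors; a trained gate would shift the balance per word.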

