Multi-turn Dialogue Response Generation with Autoregressive Transformer Models

07/26/2019
by   Oluwatobi Olabiyi, et al.

Neural dialogue models, despite their successes, still suffer from a lack of relevance, diversity, and in many cases coherence in their generated responses. These issues have been attributed to reasons including (1) short-range model architectures that capture limited temporal dependencies, (2) limitations of the maximum likelihood training objective, (3) the concave entropy profile of dialogue datasets resulting in short and generic responses, and (4) the out-of-vocabulary problem, which leads to the generation of a large number of <UNK> tokens. Autoregressive transformer models such as GPT-2, although trained with the maximum likelihood objective, do not suffer from the out-of-vocabulary problem and have demonstrated an excellent ability to capture long-range structures in language modeling tasks. In this paper, we examine the use of autoregressive transformer models for multi-turn dialogue response generation. In our experiments, we employ small and medium GPT-2 models (with publicly available pretrained language model parameters) on the open-domain Movie Triples dataset and the closed-domain Ubuntu Dialogue dataset. The models (with and without pretraining) achieve significant improvements over the baselines for multi-turn dialogue response generation. They also produce state-of-the-art performance on the two datasets based on several metrics, including BLEU, ROUGE, and distinct n-gram.
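The core recipe the abstract describes — treating multi-turn dialogue as a single token sequence and letting an autoregressive language model continue it — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `<eot>` separator, the `next_token_fn` stub (which stands in for GPT-2's next-token prediction), and all function names are assumptions for demonstration purposes.

```python
EOT = "<eot>"  # hypothetical end-of-turn separator token (assumption;
               # the paper's exact special tokens are not given here)

def build_context(turns):
    """Flatten a multi-turn dialogue into one token sequence,
    with an end-of-turn marker after each utterance, so an
    autoregressive LM can condition on the full history."""
    tokens = []
    for turn in turns:
        tokens.extend(turn.split())
        tokens.append(EOT)
    return tokens

def generate_response(turns, next_token_fn, max_len=20):
    """Greedy autoregressive decoding: repeatedly ask the model
    for the next token given everything generated so far, and
    stop at the end-of-turn marker or the length limit."""
    tokens = build_context(turns)
    response = []
    for _ in range(max_len):
        tok = next_token_fn(tokens)  # in practice: argmax/sample from GPT-2 logits
        if tok == EOT:
            break
        response.append(tok)
        tokens.append(tok)  # feed the generated token back in
    return " ".join(response)

def make_stub(reply):
    """Toy stand-in for a trained model: emits a canned reply
    one token at a time, then the end-of-turn marker."""
    it = iter(reply)
    return lambda tokens: next(it, EOT)
```

In a real setup, `next_token_fn` would wrap a pretrained GPT-2 forward pass over the token ids of `tokens`; the stub only makes the decoding loop observable.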

