An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation

03/09/2020
by   Piji Li, et al.

We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for open-domain dialogue generation. Parameters are learned with the standard pre-training and fine-tuning paradigm: news and Wikipedia corpora in Chinese and English, respectively, are collected for the pre-training stage, and during fine-tuning the dialogue context and response are concatenated into a single sequence that serves as the model input. A weighted joint prediction objective over both context and response is designed to compare models trained with and without the loss term for context prediction. Several decoding strategies, including greedy search, beam search, and top-k sampling, are employed for response generation. Extensive experiments are conducted on typical single-turn and multi-turn dialogue corpora, including Weibo, Douban, Reddit, DailyDialog, and Persona-Chat. Detailed automatic evaluation results on the relevance and diversity of the generated responses are reported for the language models as well as the baseline approaches.
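The weighted joint prediction objective and the decoding strategies mentioned in the abstract can be illustrated with a short, self-contained PyTorch sketch. This is a minimal illustration under stated assumptions, not the authors' released code: TinyLM, joint_lm_loss, top_k_sample, and the context weight alpha are hypothetical names and defaults introduced only for exposition, and the tiny stand-in model replaces the pre-trained Transformer.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    """Stand-in auto-regressive LM: token embedding followed by a linear
    projection to the vocabulary. A real experiment would use a pre-trained
    Transformer instead."""
    def __init__(self, vocab_size=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, input_ids):
        return self.head(self.embed(input_ids))   # (batch, seq_len, vocab)

def joint_lm_loss(logits, input_ids, context_len, alpha=0.5):
    """Weighted joint prediction: cross-entropy over the whole concatenated
    context+response sequence, with context positions weighted by alpha and
    response positions by 1.0."""
    pred = logits[:, :-1, :]                       # position t predicts token t+1
    target = input_ids[:, 1:]
    token_loss = F.cross_entropy(
        pred.reshape(-1, pred.size(-1)), target.reshape(-1), reduction="none"
    ).view(target.shape)                           # (batch, seq_len - 1)
    weights = torch.ones_like(token_loss)
    weights[:, : context_len - 1] = alpha          # targets that are context tokens
    return (token_loss * weights).sum() / weights.sum()

def top_k_sample(model, prefix_ids, steps=20, k=10, temperature=1.0):
    """One of the decoding strategies named in the abstract: at each step keep
    the k most probable next tokens and sample among them."""
    ids = prefix_ids
    for _ in range(steps):
        logits = model(ids)[:, -1, :] / temperature
        topk_logits, topk_idx = logits.topk(k, dim=-1)
        next_id = topk_idx.gather(-1, torch.multinomial(F.softmax(topk_logits, dim=-1), 1))
        ids = torch.cat([ids, next_id], dim=-1)
    return ids

# Toy usage: a 4-token context and a 5-token response concatenated into one sequence.
model = TinyLM()
seq = torch.randint(0, 100, (1, 9))
loss = joint_lm_loss(model(seq), seq, context_len=4, alpha=0.5)
loss.backward()
response = top_k_sample(model, seq[:, :4], steps=5, k=10)

In this sketch, setting alpha to zero corresponds to training without the context-prediction loss term, which is the comparison the abstract describes.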

Related research

08/03/2021 · EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training
Although pre-trained language models have remarkably enhanced the genera...

04/05/2020 · Semantics of the Unwritten
The semantics of a text is manifested not only by what is read, but also...

05/18/2023 · SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation
Language models trained on large-scale corpora can generate remarkably f...

07/28/2022 · Persona-Knowledge Dialogue Multi-Context Retrieval and Enhanced Decoding Methods
Persona and Knowledge dual context open-domain chat is a novel dialogue ...

04/14/2023 · Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games
Language models pre-trained on large self-supervised corpora, followed b...

05/04/2020 · A New Data Normalization Method to Improve Dialogue Generation by Minimizing Long Tail Effect
Recent neural models have shown significant progress in dialogue generat...

10/06/2020 · StyleDGPT: Stylized Response Generation with Pre-trained Language Models
Generating responses following a desired style has great potentials to e...
