SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks

09/05/2021
by Wanyu Du, et al.

Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior works leverage Transformer-based pre-trained language models to generate text with desired attributes via two general approaches: (1) gradient-based methods, which update all latent representations of the pre-trained model with gradients from attribute models; and (2) weighted-decoding methods, which re-rank beam candidates from the pre-trained model with attribute functions. However, gradient-based methods incur high computational cost and easily overfit on small training sets, while weighted-decoding methods are inherently constrained by the low-variance, high-bias pre-trained model. In this work, we propose a novel approach to controlling the generation of Transformer-based pre-trained language models: the SideControl framework, which leverages a novel control attribute loss to incorporate useful control signals and is shown to perform well with very limited training samples. We evaluate our proposed method on two benchmark open-domain dialogue datasets, and the results show that the SideControl framework achieves better controllability, higher generation quality, and better sample efficiency than existing gradient-based and weighted-decoding baselines.
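The abstract does not spell out the architecture, but the idea named in the title, an additive side network attached to a frozen pre-trained language model, can be sketched roughly as follows. Everything in this sketch (the GPT-2 backbone, the MLP side network, the bottleneck width, and the omitted control-attribute loss term) is an illustrative assumption, not the paper's exact design.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class SideNetwork(nn.Module):
    """A small trainable network whose output is added to the frozen base
    model's hidden states (hypothetical architecture; the paper's exact
    side network may differ)."""

    def __init__(self, hidden_size: int, bottleneck: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, bottleneck),
            nn.GELU(),
            nn.Linear(bottleneck, hidden_size),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.mlp(hidden_states)


class SideControlledLM(nn.Module):
    """Frozen pre-trained LM plus an additive, trainable side network."""

    def __init__(self, model_name: str = "gpt2"):
        super().__init__()
        self.base = GPT2LMHeadModel.from_pretrained(model_name)
        # Freeze the pre-trained model: only the side network is updated.
        for p in self.base.parameters():
            p.requires_grad = False
        self.side = SideNetwork(self.base.config.n_embd)

    def forward(self, input_ids, attention_mask=None):
        # Frozen base transformer produces contextual hidden states.
        hidden = self.base.transformer(
            input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Additive correction from the trainable side network.
        hidden = hidden + self.side(hidden)
        # Reuse the frozen LM head to map back to vocabulary logits.
        return self.base.lm_head(hidden)


if __name__ == "__main__":
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = SideControlledLM()
    inputs = tokenizer("Hello, how are you today?", return_tensors="pt")
    logits = model(inputs["input_ids"], inputs["attention_mask"])
    # Training would combine a standard LM loss on the dialogue response
    # with a control-attribute loss from an attribute model (omitted here).
    print(logits.shape)
```

In this sketch only the side network's parameters receive gradients, which is consistent with the sample-efficiency argument in the abstract: the large pre-trained model stays fixed while a small number of added parameters is trained with the combined language-model and control-attribute objectives.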


