Causal Document-Grounded Dialogue Pre-training

05/18/2023
by Yingxiu Zhao, et al.

The goal of document-grounded dialogue (DocGD) is to generate a response by grounding the evidence in a supporting document in accordance with the dialogue context. This process involves four causally connected variables. Task-specific pre-training has recently boosted performance on many downstream tasks, yet existing DocGD methods still rely on general pre-trained language models without a specifically tailored pre-training approach that explicitly captures these causal relationships. To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-level DocGD pre-training corpora. To better capture causality, we further propose a causally-perturbed pre-training strategy, which introduces causal perturbations on the variables and optimizes the overall causal effect. Experiments on three benchmark datasets demonstrate that our causal pre-training achieves considerable and consistent improvements under fully-supervised, low-resource, few-shot, and zero-shot settings.

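The abstract does not spell out implementation details, but a toy sketch can make the idea of a causally-perturbed objective concrete. The snippet below is only one plausible reading of the abstract, not the authors' method: the backbone (t5-small), the perturbation (deleting the evidence span from the document), the margin value, and the variable names c, d, e, r are all illustrative assumptions.

```python
# Minimal sketch of a causally-perturbed objective for DocGD pre-training.
# NOT the authors' implementation: backbone, perturbation, and loss weighting
# are assumptions made for illustration, based only on the abstract.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def nll(context: str, document: str, response: str) -> torch.Tensor:
    """Negative log-likelihood of the response given context + document."""
    inputs = tokenizer(f"dialogue: {context} document: {document}",
                       return_tensors="pt", truncation=True)
    labels = tokenizer(response, return_tensors="pt", truncation=True).input_ids
    return model(**inputs, labels=labels).loss

# Toy instance of the four DocGD variables named in the abstract:
# dialogue context c, grounding document d, evidence e (a span of d), response r.
c = "User: How do I reset my password?"
e = "Passwords can be reset from the account settings page."
d = f"Welcome to the help center. {e} Contact support for billing issues."
r = "You can reset it from the account settings page."

loss_factual = nll(c, d, r)                   # evidence present in the document
loss_perturbed = nll(c, d.replace(e, ""), r)  # causal perturbation: evidence removed

# Reward grounded generation and use the gap between perturbed and factual
# losses as a crude proxy for the causal effect of the evidence on the
# response; the hinge margin of 1.0 is an arbitrary illustrative choice.
causal_effect = loss_perturbed - loss_factual
objective = loss_factual + torch.relu(1.0 - causal_effect)
objective.backward()
print(float(loss_factual), float(loss_perturbed), float(objective))
```

The intent is only to show the shape of such an objective: the response should remain easy to generate when the evidence is available, and noticeably harder once the evidence is perturbed away, which is one way to operationalize "optimizing the overall causal effect."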

Related research

09/19/2022 · Semantic-based Pre-training for Dialogue Understanding
Pre-trained language models have made great progress on dialogue tasks. ...

10/05/2020 · KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
Data-to-text generation has recently attracted substantial interests due...

03/19/2022 · Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension
Comprehending a dialogue requires a model to capture diverse kinds of ke...

06/01/2021 · Dialogue-oriented Pre-training
Pre-trained language models (PrLM) has been shown powerful in enhancing ...

08/01/2023 · ZRIGF: An Innovative Multimodal Framework for Zero-Resource Image-Grounded Dialogue Generation
Image-grounded dialogue systems benefit greatly from integrating visual ...

04/14/2023 · Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games
Language models pre-trained on large self-supervised corpora, followed b...

09/12/2023 · Circuit Breaking: Removing Model Behaviors with Targeted Ablation
Language models often exhibit behaviors that improve performance on a pr...
