
Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining

by Yicheng Zou, et al.
Fudan University

With the rapid increase in the volume of dialogue data from daily life, demand for dialogue summarization is growing. Unfortunately, training a large summarization model is generally infeasible because dialogue data with annotated summaries is scarce. Most existing work on low-resource dialogue summarization pretrains models directly on other domains, e.g., the news domain, but generally neglects the large differences between dialogues and conventional articles. To bridge the gap between out-of-domain pretraining and in-domain fine-tuning, we propose a multi-source pretraining paradigm that makes better use of external summary data. Specifically, we exploit large-scale in-domain non-summary data to pretrain the dialogue encoder and the summary decoder separately. The combined encoder-decoder model is then pretrained on out-of-domain summary data using adversarial critics, with the aim of facilitating domain-agnostic summarization. Experimental results on two public datasets show that, with only limited training data, our approach achieves competitive performance and generalizes well across different dialogue scenarios.
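The order of the stages described in the abstract can be laid out as a small schedule. This is only an illustrative sketch: the names `Stage`, `build_schedule`, and `run` are hypothetical, the final fine-tuning stage is inferred from the abstract's mention of in-domain fine-tuning, and no actual model, loss, or critic from the paper is implemented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical structure, for illustration only)."""
    name: str
    trains: tuple          # components updated during this stage
    data: str              # kind of data the stage consumes
    needs_summaries: bool  # whether the data carries reference summaries


def build_schedule():
    """Return the stages in the order suggested by the abstract."""
    return [
        # In-domain, non-summary data: encoder and decoder are pretrained separately.
        Stage("encoder_pretrain", ("encoder",),
              "in-domain dialogues without summaries", False),
        Stage("decoder_pretrain", ("decoder",),
              "in-domain non-summary text", False),
        # Out-of-domain summary data: combined model trained with adversarial critics.
        Stage("joint_adversarial_pretrain", ("encoder", "decoder", "critics"),
              "out-of-domain summary data", True),
        # Assumed final step: fine-tune on the limited annotated dialogue data.
        Stage("fine_tune", ("encoder", "decoder"),
              "limited annotated dialogue summaries", True),
    ]


def run(schedule, train_fn):
    """Run each stage in order; the joint stage requires both parts pretrained."""
    trained = set()
    completed = []
    for stage in schedule:
        if "critics" in stage.trains and not {"encoder", "decoder"} <= trained:
            raise ValueError("joint stage needs a pretrained encoder and decoder")
        train_fn(stage)  # caller supplies the actual training routine
        trained.update(stage.trains)
        completed.append(stage.name)
    return completed


if __name__ == "__main__":
    for name in run(build_schedule(), train_fn=lambda stage: None):
        print(name)
```

The point of the layout is that only the last two stages need summaries at all, and only the last one needs dialogue summaries, which is the scarce resource.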



