DG2: Data Augmentation Through Document Grounded Dialogue Generation

12/15/2021
by   Qingyang Wu, et al.

Collecting data for training dialog systems can be extremely expensive due to the involvement of human participants and the need for extensive annotation. In document-grounded dialog systems especially, human experts must carefully read unstructured documents to answer users' questions. As a result, existing document-grounded dialog datasets are relatively small, which hinders the effective training of dialogue systems. In this paper, we propose an automatic, document-grounded data augmentation technique based on a generative dialogue model. The dialogue model consists of a user bot and an agent bot that synthesize diverse dialogues given an input document; the synthesized dialogues are then used to train a downstream model. When used to supplement the original dataset, our method achieves significant improvement over traditional data augmentation methods. It also performs well in the low-resource setting.
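
The user-bot/agent-bot loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses generative dialogue models, while the two bots below are hypothetical rule-based stand-ins, and every name (`toy_user_bot`, `toy_agent_bot`, `synthesize_dialogue`, `augment`) is an assumption introduced only to show how synthetic dialogues grounded in a document might be appended to an original dataset.

```python
import random
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


def _sentences(document: str) -> List[str]:
    """Split a document into non-empty sentences on periods (toy tokenization)."""
    return [s.strip() for s in document.split(".") if s.strip()]


def toy_user_bot(document: str, history: List[Turn], rng: random.Random) -> str:
    """Hypothetical stand-in for a generative user model:
    asks a question about a randomly chosen sentence of the document."""
    topic = rng.choice(_sentences(document))
    return f"Can you tell me more about this: {topic}?"


def toy_agent_bot(document: str, history: List[Turn]) -> str:
    """Hypothetical stand-in for a generative agent model:
    replies with the document sentence sharing the most words with the last user turn."""
    question_words = set(history[-1][1].lower().split())
    best = max(
        _sentences(document),
        key=lambda s: len(set(s.lower().split()) & question_words),
    )
    return best + "."


def synthesize_dialogue(document: str, num_turns: int, seed: int) -> List[Turn]:
    """Alternate user and agent turns, both conditioned on the same document."""
    rng = random.Random(seed)
    history: List[Turn] = []
    for _ in range(num_turns):
        history.append(("user", toy_user_bot(document, history, rng)))
        history.append(("agent", toy_agent_bot(document, history)))
    return history


def augment(
    original: List[List[Turn]],
    documents: List[str],
    per_document: int,
    num_turns: int = 2,
) -> List[List[Turn]]:
    """Return the original dialogues plus synthetic ones generated per document."""
    synthetic = [
        synthesize_dialogue(doc, num_turns, seed=i * per_document + j)
        for i, doc in enumerate(documents)
        for j in range(per_document)
    ]
    return original + synthetic
```

In this sketch the augmented list would then be passed to whatever downstream training routine is in use; varying the seed is a crude substitute for the sampling diversity a real generative model would provide.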

research
05/02/2023

Turning Flowchart into Dialog: Plan-based Data Augmentation for Low-Resource Flowchart-grounded Troubleshooting Dialogs

Flowchart-grounded troubleshooting dialogue (FTD) systems, which follow ...
research
05/24/2023

Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages

This paper proposes a framework to address the issue of data scarcity in...
research
04/14/2023

Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10

This paper summarizes our contributions to the document-grounded dialog ...
research
06/17/2022

CookDial: A dataset for task-oriented dialogs grounded in procedural documents

This work presents a new dialog dataset, CookDial, that facilitates rese...
research
04/17/2020

A Survey of Document Grounded Dialogue Systems (DGDS)

Dialogue system (DS) attracts great attention from industry and academia...
research
02/04/2021

Converse, Focus and Guess – Towards Multi-Document Driven Dialogue

We propose a novel task, Multi-Document Driven Dialogue (MD3), in which ...
research
03/21/2016

Data Augmentation via Levy Processes

If a document is about travel, we may expect that short snippets of the ...
