ViGGO: A Video Game Corpus for Data-To-Text Generation in Open-Domain Conversation

10/26/2019
by   Juraj Juraska, et al.

The uptake of deep learning in natural language generation (NLG) has led to the release of both small and relatively large parallel corpora for training neural models. The existing data-to-text datasets are, however, aimed at task-oriented dialogue systems, and are thus often limited in diversity and versatility. They are typically crowdsourced, with much of the noise left in them. Moreover, current neural NLG models do not take full advantage of large training data and, owing to their strong generalizing properties, produce template-like sentences regardless. We therefore present a new corpus of 7K samples, which (1) is clean despite being crowdsourced, (2) has utterances of 9 generalizable and conversational dialogue act types, making it more suitable for open-domain dialogue systems, and (3) explores the domain of video games, which is new to dialogue systems despite having excellent potential for supporting rich conversations.

research
06/04/2019

Training Neural Response Selection for Task-Oriented Dialogue Systems

Despite their popularity in the chatbot literature, retrieval-based mode...
research
05/01/2018

Exploring Conversational Language Generation for Rich Content about Hotels

Dialogue systems for hotel and tourist information have typically simpli...
research
05/09/2020

Diversifying Dialogue Generation with Non-Conversational Text

Neural network-based sequence-to-sequence (seq2seq) models strongly suff...
research
09/14/2018

Characterizing Variation in Crowd-Sourced Data for Training Neural Language Generators to Produce Stylistically Varied Outputs

One of the biggest challenges of end-to-end language generation from mea...
research
05/24/2022

A Dataset for Sentence Retrieval for Open-Ended Dialogues

We address the task of sentence retrieval for open-ended dialogues. The ...
research
04/06/2020

Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight

Current state-of-the-art neural dialogue models learn from human convers...
research
11/03/2020

Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems

Scarcity of training data for task-oriented dialogue systems is a well k...
