Crowd-sourcing NLG Data: Pictures Elicit Better Data

08/01/2016
by Jekaterina Novikova, et al.

Recent advances in corpus-based Natural Language Generation (NLG) hold the promise of being easily portable across domains, but they require costly training data consisting of meaning representations (MRs) paired with Natural Language (NL) utterances. In this work, we propose a novel framework for crowdsourcing high-quality NLG training data, using automatic quality control measures and evaluating different MRs with which to elicit data. We show that pictorial MRs result in better NL data being collected than logic-based MRs: utterances elicited by pictorial MRs are judged as significantly more natural, more informative, and better phrased, with a significant increase in average quality ratings (around 0.5 points on a 6-point scale) compared with logic-based MRs. As the MR becomes more complex, the benefits of pictorial stimuli increase. The collected data will be released as part of this submission.


research
11/08/2019

A Good Sample is Hard to Find: Noise Injection Sampling and Self-Training for Neural Language Generation Models

Deep neural networks (DNN) are quickly becoming the de facto standard mo...
research
09/14/2018

Characterizing Variation in Crowd-Sourced Data for Training Neural Language Generators to Produce Stylistically Varied Outputs

One of the biggest challenges of end-to-end language generation from mea...
research
12/25/2018

Building a Neural Semantic Parser from a Domain Ontology

Semantic parsing is the task of converting natural language utterances i...
research
01/12/2020

Stochastic Natural Language Generation Using Dependency Information

This article presents a stochastic corpus-based model for generating nat...
research
10/31/2022

The Effect of Multiple Replies for Natural Language Generation Chatbots

In this research, by responding to users' utterances with multiple repli...
research
01/06/2016

Language to Logical Form with Neural Attention

Semantic parsing aims at mapping natural language to machine interpretab...
research
01/12/2021

Transforming Multi-Conditioned Generation from Meaning Representation

In task-oriented conversation systems, natural language generation syste...
