Analysing Data-To-Text Generation Benchmarks

05/10/2017
by Laura Perez-Beltrachini, et al.

Recently, several data-sets associating data to text have been created to train data-to-text surface realisers. It is unclear, however, to what extent the surface realisation task exercised by these data-sets is linguistically challenging. Do these data-sets provide enough variety to encourage the development of generic, high-quality data-to-text surface realisers? In this paper, we argue that these data-sets have important drawbacks. We back up our claim using statistics, metrics and manual evaluation. We conclude by eliciting a set of criteria for the creation of a data-to-text benchmark which could better support the development, evaluation and comparison of linguistically sophisticated data-to-text surface realisers.
