Revisiting Challenges in Data-to-Text Generation with Fact Grounding

01/12/2020
by Hongmin Wang, et al.

Data-to-text generation models face challenges in ensuring data fidelity, which requires referring to the correct input source. To inspire studies in this area, Wiseman et al. (2017) introduced the RotoWire corpus for generating NBA game summaries from box- and line-score tables. However, few attempts have been made in this direction, and the challenges remain. We observe a prominent bottleneck in the corpus: only about 60% of the summary contents can be grounded to the boxscore records. Such information deficiency tends to misguide a conditioned language model into producing unconditioned random facts, leading to factual hallucinations. In this work, we restore the information balance and revamp the task to focus on fact-grounded data-to-text generation. We introduce a purified and larger-scale dataset, RotoWire-FG (Fact-Grounding), with 50% more data and enriched input tables, hoping to attract more research attention in this direction. Moreover, we achieve improved data fidelity over state-of-the-art models by integrating a new form of table reconstruction as an auxiliary task to boost generation quality.


