Learning to Reason for Text Generation from Scientific Tables

04/16/2021
by Nafise Sadat Moosavi, et al.

In this paper, we introduce SciGen, a new challenge dataset for reasoning-aware data-to-text generation, consisting of tables from scientific articles and their corresponding descriptions. Describing scientific tables goes beyond the surface realization of the table content and requires reasoning over table values. SciGen has two distinguishing properties: (1) its tables mostly contain numerical values, and (2) the corresponding descriptions require arithmetic reasoning. SciGen is therefore the first dataset that assesses the arithmetic reasoning capabilities of generation models on complex input structures, i.e., tables from scientific articles. We study the effectiveness of state-of-the-art data-to-text generation models on SciGen and evaluate the results using common automatic metrics as well as human evaluation. Our results and analyses show that (a) while humans tend to reason when describing scientific tables, the ability of state-of-the-art models to do so is severely limited, (b) while adding more training data improves the results, it is not the solution for reasoning-aware text generation, and (c) one of the main bottlenecks for this task is the lack of proper automatic evaluation metrics. The data, code, and annotations for human evaluation will be available at https://github.com/UKPLab/SciGen. SciGen opens new avenues for future research in reasoning-aware text generation and evaluation.
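To make "arithmetic reasoning over table values" concrete, the following is a minimal illustrative sketch in Python. It is not the SciGen data format and not the authors' method. The table contents, the system names, and the helper functions `best_system`, `difference`, and `describe` are all hypothetical, invented only to show the kind of comparison and subtraction that a table description may require beyond copying cell values.

```python
from typing import Dict

Table = Dict[str, Dict[str, float]]

# Hypothetical toy table (made-up values, not taken from SciGen):
# rows are systems, columns are evaluation metrics.
table: Table = {
    "Baseline": {"F1": 70.0, "Accuracy": 75.0},
    "Proposed": {"F1": 73.5, "Accuracy": 74.0},
}


def best_system(table: Table, metric: str) -> str:
    """Return the row name with the highest value for the given metric."""
    return max(table, key=lambda name: table[name][metric])


def difference(table: Table, metric: str, a: str, b: str) -> float:
    """Return the value of row `a` minus the value of row `b` for a metric."""
    return round(table[a][metric] - table[b][metric], 2)


def describe(table: Table, metric: str, a: str, b: str) -> str:
    """Produce a one-sentence comparison that states a computed difference."""
    delta = difference(table, metric, a, b)
    if delta > 0:
        relation = "outperforms"
    elif delta < 0:
        relation = "underperforms"
    else:
        return f"{a} matches {b} on {metric}."
    return f"{a} {relation} {b} by {abs(delta):.1f} {metric} points."


if __name__ == "__main__":
    print(best_system(table, "F1"))
    print(describe(table, "F1", "Proposed", "Baseline"))
    print(describe(table, "Accuracy", "Proposed", "Baseline"))
```

The point of the sketch is that neither the difference nor the ranking appears in any single cell: a description that states them has to be derived from several values, which is the capability the dataset is designed to assess.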
