Improving User Controlled Table-To-Text Generation Robustness

02/20/2023
by Hanxu Hu, et al.

In this work we study user-controlled table-to-text generation, where users explore the content of a table by selecting cells and reading a natural language description of them that is automatically produced by a natural language generator. Such generation models usually learn from carefully selected cell combinations (clean cell selections); in practice, however, users may select unexpected, redundant, or incoherent cell combinations (noisy cell selections). In experiments, we find that models perform well on test sets drawn from the same distribution as the training data, but their performance drops when they are evaluated on realistic noisy user inputs. We propose a fine-tuning regime with additional user-simulated noisy cell selections. Models fine-tuned with the proposed regime gain 4.85 BLEU points on noisy user test cases and 1.4 on clean test cases, and achieve performance comparable to the state of the art on the ToTTo dataset.
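The abstract does not say how the noisy cell selections are simulated. As a minimal sketch of one plausible approach, the function below adds a few randomly chosen extra cells to a clean selection, which mimics the "redundant" case. The function name, its parameters, and the random-extra-cells strategy are all assumptions made for illustration; they are not the authors' procedure.

```python
import random


def simulate_noisy_selection(clean_cells, n_rows, n_cols, n_extra=2, seed=None):
    """Return the clean selection plus up to n_extra randomly chosen extra cells.

    Hypothetical helper: cells are (row, col) tuples in a table of
    n_rows x n_cols. This is an illustrative noise model only.
    """
    rng = random.Random(seed)
    clean = set(clean_cells)
    # Candidate noise cells: every table cell not already in the clean selection.
    candidates = [
        (r, c)
        for r in range(n_rows)
        for c in range(n_cols)
        if (r, c) not in clean
    ]
    # Never request more extra cells than the table can supply.
    extra = rng.sample(candidates, min(n_extra, len(candidates)))
    # Return cells in row-major order so the selection is deterministic to read.
    return sorted(clean | set(extra))


clean = [(0, 1), (2, 3)]
noisy = simulate_noisy_selection(clean, n_rows=4, n_cols=5, n_extra=2, seed=0)
```

Under this assumption, each noisy selection could be paired with the original reference description to form an additional fine-tuning example, so the model learns to produce the same text despite the distracting cells.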

research
04/29/2020

ToTTo: A Controlled Table-To-Text Generation Dataset

We present ToTTo, an open-domain English table-to-text dataset with over...
research
05/24/2023

Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers

Large language models (LLMs) have shown remarkable ability on controllab...
research
08/23/2022

Few-Shot Table-to-Text Generation with Prefix-Controlled Generator

Neural table-to-text generation approaches are data-hungry, limiting the...
research
05/08/2022

Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning

Controlled table-to-text generation seeks to generate natural language d...
research
05/22/2022

Diversity Enhanced Table-to-Text Generation via Type Control

Generating natural language statements to convey information from tabula...
research
05/24/2022

Medical Scientific Table-to-Text Generation with Human-in-the-Loop under the Data Sparsity Constraint

Structured (tabular) data in the preclinical and clinical domains contai...
research
03/29/2023

RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes

Can foundation models (such as ChatGPT) clean your data? In this proposa...