A Sequence-to-Sequence Set Model for Text-to-Table Generation

05/31/2023
by Tong Li, et al.

Text-to-table generation has recently attracted increasing attention because of its wide range of applications. The dominant model formalizes the task as sequence-to-sequence generation and, during training, serializes each table into a token sequence by concatenating all rows in top-down order. However, it suffers from two serious defects: 1) the predefined order introduces a wrong bias during training, which heavily penalizes shifts in the order of rows; 2) error propagation becomes serious when the model outputs a long token sequence. In this paper, we first conduct a preliminary study demonstrating that the generation of most rows is order-insensitive. We then propose a novel sequence-to-sequence set text-to-table generation model. Specifically, in addition to a text encoder that encodes the input text, our model is equipped with a table header generator that first outputs the table header, i.e., the first row of the table, by sequence generation. A table body generator with learnable row embeddings and column embeddings then generates a set of table body rows in parallel. Because there is no predefined correspondence between generated table body rows and target rows during training, we propose a target assignment strategy based on bipartite matching between the first cells of the generated table body rows and those of the targets. Experimental results show that our model significantly surpasses the baselines, achieving state-of-the-art performance on commonly used datasets.
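The target assignment step can be illustrated with a minimal sketch. It is not the authors' implementation: the cost function `first_cell_cost` (token overlap between first cells), the function `assign_targets`, and the brute-force search are all assumptions made for illustration, since the abstract does not specify how matching costs are computed or which matching algorithm is used. The sketch only shows the idea of pairing generated rows with target rows by their first cells, independent of row order.

```python
from itertools import permutations


def first_cell_cost(pred_row, target_row):
    """Hypothetical cost: 1 - Jaccard overlap of the first cells' tokens."""
    pred = set(pred_row[0].lower().split())
    gold = set(target_row[0].lower().split())
    if not pred and not gold:
        return 0.0
    return 1.0 - len(pred & gold) / len(pred | gold)


def assign_targets(pred_rows, target_rows):
    """Return a list where entry i is the index of the target row matched
    to generated row i, or None if that generated row is unmatched.

    Brute-force minimum-cost bipartite matching; only suitable for
    small tables (an assumption made to keep the sketch dependency-free).
    """
    n_pred, n_tgt = len(pred_rows), len(target_rows)
    if n_pred < n_tgt:
        raise ValueError("need at least as many generated rows as targets")
    cost = [[first_cell_cost(p, t) for t in target_rows] for p in pred_rows]

    best_total, best_slots = float("inf"), ()
    # Each candidate picks a distinct generated-row slot for every target.
    for slots in permutations(range(n_pred), n_tgt):
        total = sum(cost[slot][j] for j, slot in enumerate(slots))
        if total < best_total:
            best_total, best_slots = total, slots

    assignment = [None] * n_pred
    for j, slot in enumerate(best_slots):
        assignment[slot] = j
    return assignment


# Three generated rows, two target rows given in a different order.
generated = [["Lakers", "102"], ["", ""], ["Boston Celtics", "99"]]
targets = [["Celtics", "99"], ["Lakers", "102"]]
print(assign_targets(generated, targets))
```

In this example the generated rows are matched to the targets whose first cells they resemble most, regardless of where those targets sit in the reference table; the second generated row stays unmatched. How unmatched rows are treated during training is not stated in the abstract.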


research
05/24/2021

One2Set: Generating Diverse Keyphrases as a Set

Recently, the sequence-to-sequence models have made remarkable progress ...
research
03/01/2022

TableFormer: Robust Transformer Modeling for Table-Text Encoding

Understanding tables is an important aspect of natural language understa...
research
09/05/2019

Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)

Although Seq2Seq models for table-to-text generation have achieved remar...
research
05/08/2022

Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning

Controlled table-to-text generation seeks to generate natural language d...
research
01/31/2023

The Power of External Memory in Increasing Predictive Model Capacity

One way of introducing sparsity into deep networks is by attaching an ex...
research
08/18/2021

Table Caption Generation in Scholarly Documents Leveraging Pre-trained Language Models

This paper addresses the problem of generating table captions for schola...
research
06/30/2018

Title Generation for Web Tables

Descriptive titles provide crucial context for interpreting tables that ...
