FORTAP: Using Formulae for Numerical-Reasoning-Aware Table Pretraining

09/15/2021
by Zhoujun Cheng, et al.

Tables store rich numerical data, but numerical reasoning over tables remains a challenge. In this paper, we find that the spreadsheet formula, which performs calculations on numerical values in tables, naturally provides strong supervision for numerical reasoning. More importantly, large numbers of spreadsheets with expert-made formulae are available on the web and can be obtained easily. FORTAP is the first method for numerical-reasoning-aware table pretraining that leverages a large corpus of spreadsheet formulae. We design two formula pretraining tasks to explicitly guide FORTAP to learn numerical reference and calculation in semi-structured tables. FORTAP achieves state-of-the-art results on two representative downstream tasks, cell type classification and formula prediction, showing the great potential of numerical-reasoning-aware pretraining.
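To make the idea of a formula as supervision concrete, the following is a minimal sketch, not the FORTAP implementation. It assumes simple formulae written as strings and uses a hypothetical helper, `formula_supervision`, to pull out the referenced cells (a signal for numerical reference) and the functions and operators (a signal for numerical calculation). Ranges such as `B2:B4` are reported by their endpoints only, and function names containing digits are not handled.

```python
import re

# Hypothetical patterns for illustration only; real spreadsheet grammars are richer.
CELL_REF = re.compile(r"\$?[A-Z]{1,3}\$?[0-9]+")  # e.g. B2, $C$5
FUNCTION = re.compile(r"([A-Z]+)\(")              # e.g. SUM( -> SUM
OPERATOR = re.compile(r"[+\-*/]")                 # basic arithmetic operators


def formula_supervision(formula):
    """Split a simple formula string into reference and calculation signals."""
    body = formula.lstrip("=")
    return {
        "references": CELL_REF.findall(body),  # which cells the formula reads
        "functions": FUNCTION.findall(body),   # which functions it applies
        "operators": OPERATOR.findall(body),   # which arithmetic operators it uses
    }


if __name__ == "__main__":
    print(formula_supervision("=B2+B3"))
    print(formula_supervision("=SUM(B2:B4)/C5"))
```

In this sketch, the `references` list plays the role of a label for which cells a formula depends on, while `functions` and `operators` describe the calculation applied to them; how FORTAP actually encodes these signals is defined in the full paper.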

research
05/06/2021

TABBIE: Pretrained Representations of Tabular Data

Existing work on tabular representation learning jointly models tables a...
research
05/12/2023

Comprehensive Solution Program Centric Pretraining for Table-and-Text Hybrid Numerical Reasoning

Numerical reasoning over table-and-text hybrid passages, such as financi...
research
05/13/2022

Improving the Numerical Reasoning Skills of Pretrained Language Models

State-of-the-art pretrained language models tend to perform below their ...
research
07/08/2022

OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering

The information in tables can be an important complement to text, making...
research
04/16/2021

Learning to Reason for Text Generation from Scientific Tables

In this paper, we introduce SciGen, a new challenge dataset for the task...
research
04/16/2022

Logical Inference for Counting on Semi-structured Tables

Recently, the Natural Language Inference (NLI) task has been studied for...
research
06/26/2021

SpreadsheetCoder: Formula Prediction from Semi-structured Context

Spreadsheet formula prediction has been an important program synthesis p...
