Injecting Numerical Reasoning Skills into Language Models

04/09/2020
by Mor Geva, et al.

Large pre-trained language models (LMs) are known to encode substantial amounts of linguistic information. However, high-level reasoning skills, such as numerical reasoning, are difficult to learn from a language-modeling objective alone. Consequently, existing models for numerical reasoning have used specialized architectures with limited flexibility. In this work, we show that numerical reasoning is amenable to automatic data generation, and thus one can inject this skill into pre-trained LMs by generating large amounts of data and training in a multi-task setup. We show that pre-training our model, GenBERT, on this data dramatically improves performance on DROP (49.3 → 72.3 F1), reaching performance that matches state-of-the-art models of comparable size, while using a simple and general-purpose encoder-decoder architecture. Moreover, GenBERT generalizes well to math word problem datasets, while maintaining high performance on standard RC tasks. Our approach provides a general recipe for injecting skills into large pre-trained LMs, whenever the skill is amenable to automatic data augmentation.
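To make the idea of automatic data generation concrete, here is a minimal sketch of how synthetic numerical-reasoning examples might be produced from templates. The template wording, operation set, and function names are illustrative assumptions, not the paper's actual generation scheme.

```python
import random

# Illustrative operations; the paper's generated data covers a broader
# range of numerical skills than this sketch.
OPS = {
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
}

def generate_example(rng: random.Random) -> tuple[str, str]:
    """Generate one (question, answer) pair from a simple arithmetic template."""
    a, b = rng.randint(0, 10_000), rng.randint(0, 10_000)
    op = rng.choice(sorted(OPS))
    question = f"What is {a} {op} {b}?"
    answer = str(OPS[op](a, b))
    return question, answer

def generate_dataset(n: int, seed: int = 0) -> list[tuple[str, str]]:
    """Generate n examples reproducibly; such data can then be mixed
    with standard reading-comprehension data in a multi-task setup."""
    rng = random.Random(seed)
    return [generate_example(rng) for _ in range(n)]

if __name__ == "__main__":
    for q, ans in generate_dataset(3):
        print(q, "->", ans)
```

Because generation is fully programmatic, arbitrarily large training sets can be produced cheaply, which is what makes this skill amenable to injection via pre-training.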

Related research:
- Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills (07/15/2021)
- NT5?! Training T5 to Perform Numerical Reasoning (04/15/2021)
- Performance of the Pre-Trained Large Language Model GPT-4 on Automated Short Answer Grading (09/17/2023)
- Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic (11/03/2022)
- FERMAT: An Alternative to Accuracy for Numerical Reasoning (05/27/2023)
- Teaching Algorithmic Reasoning via In-context Learning (11/15/2022)
- Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge (06/11/2020)
