EQUATE: A Benchmark Evaluation Framework for Quantitative Reasoning in Natural Language Inference

01/11/2019
by   Abhilasha Ravichander, et al.

Quantitative reasoning is an important component of reasoning that any intelligent natural language understanding system can reasonably be expected to handle. We present EQUATE (Evaluating Quantitative Understanding Aptitude in Textual Entailment), a new dataset for evaluating how well models reason with quantities in textual entailment, covering not only arithmetic and algebraic computation but also other phenomena such as range comparisons and verbal reasoning with quantities. The average performance of 7 published textual entailment models on EQUATE does not exceed a majority-class baseline, indicating that current models do not implicitly learn to reason with quantities. We propose a new baseline, Q-REAS, that manipulates quantities symbolically; it achieves some success on numerical reasoning but struggles with the more verbal aspects of the task. We hope our evaluation framework will support the development of new models of quantitative reasoning in language understanding.
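
To make the idea of a range comparison in textual entailment concrete, here is a minimal sketch in Python. It is not the Q-REAS implementation, which the abstract does not describe in detail. The function names, the four modifier phrases, and the assumption that all quantities are whole-number counts are hypothetical and chosen only for illustration. Each sentence is reduced to an integer interval, and the label follows from how the premise interval relates to the hypothesis interval.

```python
import math
import re

# Illustrative only: a toy interval comparison, NOT the Q-REAS system.
# Assumption: each sentence holds one whole-number count, optionally
# preceded by one of four modifier phrases.
QUANTITY = re.compile(r"(more than|at least|less than|at most)?\s*(\d+)")


def parse_interval(sentence):
    """Return the (low, high) integer interval implied by the first quantity."""
    match = QUANTITY.search(sentence.lower())
    if match is None:
        return None
    modifier, value = match.group(1), int(match.group(2))
    if modifier == "at least":
        return (value, math.inf)
    if modifier == "more than":
        return (value + 1, math.inf)   # strict bound, counts are integers
    if modifier == "at most":
        return (-math.inf, value)
    if modifier == "less than":
        return (-math.inf, value - 1)  # strict bound, counts are integers
    return (value, value)              # bare number: an exact count


def compare_quantities(premise, hypothesis):
    """Label the pair by comparing the two quantity intervals."""
    p, h = parse_interval(premise), parse_interval(hypothesis)
    if p is None or h is None:
        return "neutral"
    (p_low, p_high), (h_low, h_high) = p, h
    if h_low <= p_low and p_high <= h_high:
        return "entailment"      # premise range lies inside hypothesis range
    if p_high < h_low or h_high < p_low:
        return "contradiction"   # the two ranges cannot both hold
    return "neutral"             # ranges overlap without containment


print(compare_quantities("The attack injured 7 people",
                         "More than 3 people were injured"))
```

A sketch like this handles only the numerical side of the task. Pairs that need verbal reasoning with quantities, which the abstract identifies as the harder case, fall outside what a pattern of this kind can capture.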
