Are NLP Models really able to Solve Simple Math Word Problems?

03/12/2021
by Arkil Patel, et al.

The problem of designing NLP solvers for math word problems (MWPs) has seen sustained research activity and steady gains in test accuracy. Since existing solvers achieve high performance on the benchmark datasets for elementary-level MWPs containing one-unknown arithmetic word problems, such problems are often considered "solved", with the bulk of research attention moving to more complex MWPs. In this paper, we restrict our attention to English MWPs taught in grades four and lower. We provide strong evidence that existing MWP solvers rely on shallow heuristics to achieve high performance on the benchmark datasets. To this end, we show that MWP solvers that do not have access to the question asked in the MWP can still solve a large fraction of MWPs. Similarly, models that treat MWPs as bags of words can also achieve surprisingly high accuracy. Further, we introduce a challenge dataset, SVAMP, created by applying carefully chosen variations to examples sampled from existing datasets. The best accuracy achieved by state-of-the-art models is substantially lower on SVAMP, showing that much remains to be done even for the simplest MWPs.
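To make the two probes in the abstract concrete, the sketch below shows one way a problem could be stripped of its question and one way it could be reduced to a bag of words. The abstract does not describe the authors' preprocessing, so this is an illustration under assumptions: the helper names (`split_sentences`, `remove_question`, `bag_of_words`), the naive sentence splitter, and the example problems are all hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (an assumption): break on whitespace that follows '.', '?' or '!'.
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s.strip()]


def remove_question(problem):
    # Drop every sentence ending in '?', keeping only the narrative body.
    return " ".join(s for s in split_sentences(problem) if not s.endswith("?"))


def bag_of_words(problem):
    # Count lowercase word and number tokens; word order is discarded.
    return Counter(re.findall(r"[a-z]+|\d+(?:\.\d+)?", problem.lower()))


# Hypothetical example problems, not taken from any benchmark dataset.
problem = "Jack had 8 pens. Mary gave him 3 more pens. How many pens does Jack have now?"
swapped = "Mary had 8 pens. Jack gave him 3 more pens. How many pens does Jack have now?"

body = remove_question(problem)
same_bag = bag_of_words(problem) == bag_of_words(swapped)
```

Here `body` keeps only the two narrative sentences, which is the kind of input a question-blind solver would see. The two example problems describe different situations yet share an identical bag of words, which illustrates why high accuracy from an order-insensitive model suggests that a benchmark can be solved with shallow cues.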


research
04/30/2022

Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers

Existing Math Word Problem (MWP) solvers have achieved high accuracy on ...
research
05/31/2022

Why are NLP Models Fumbling at Elementary Math? A Survey of Deep Learning based Word Problem Solvers

From the latter half of the last decade, there has been a growing intere...
research
09/13/2021

Adversarial Examples for Evaluating Math Word Problem Solvers

Standard accuracy metrics have shown that Math Word Problem (MWP) solver...
research
07/24/2023

Explaining Math Word Problem Solvers

Automated math word problem solvers based on neural networks have succes...
research
10/27/2021

Training Verifiers to Solve Math Word Problems

State-of-the-art language models can match human performance on many tas...
research
04/10/2023

On Evaluation of Bangla Word Analogies

This paper presents a high-quality dataset for evaluating the quality of...
research
04/10/2021

Convergence of Adaptive, Randomized, Iterative Linear Solvers

Deterministic and randomized, row-action and column-action linear solver...
