MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms

05/30/2019
by Aida Amini, et al.

We introduce a large-scale dataset of math word problems and an interpretable neural math problem solver that learns to map problems to operation programs. Because of annotation challenges, existing datasets in this domain have either been relatively small or have not offered precise operational annotations over diverse problem types. We introduce a new representation language that models the precise operation program corresponding to each math problem, with the aim of improving both the performance and the interpretability of the learned models. Using this representation language, our new dataset, MathQA, significantly enhances the AQuA dataset with fully specified operational programs. We additionally introduce a neural sequence-to-program model enhanced with automatic problem categorization. Our experiments show improvements over competitive baselines on both MathQA and AQuA. The results are still significantly lower than human performance, indicating that the dataset poses new challenges for future research. Our dataset is available at: https://math-qa.github.io/math-QA/
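To make the idea of an operation program concrete, here is a minimal Python sketch of how such a program might be executed. Everything in it is an assumption made for illustration: the pipe-separated syntax, the operation names in `OPS`, the argument tokens (`n0` for the first number in the problem, `#0` for the result of the first step, `const_100` for a constant), and the helpers `resolve` and `run_program`. It is not the dataset's specification or the authors' solver.

```python
from typing import Callable, Dict, List

# Hypothetical operation vocabulary; the dataset's real set may differ.
OPS: Dict[str, Callable[[float, float], float]] = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}


def resolve(token: str, numbers: List[float], results: List[float]) -> float:
    """Turn an argument token into a value (token scheme is assumed)."""
    if token.startswith("const_"):
        return float(token[len("const_"):])   # literal constant
    if token.startswith("#"):
        return results[int(token[1:])]        # result of an earlier step
    if token.startswith("n"):
        return numbers[int(token[1:])]        # number taken from the problem text
    raise ValueError(f"unknown argument token: {token!r}")


def run_program(program: str, numbers: List[float]) -> float:
    """Execute steps left to right and return the value of the last step."""
    results: List[float] = []
    for step in program.strip("|").split("|"):
        name, _, arg_text = step.partition("(")
        args = [
            resolve(arg.strip(), numbers, results)
            for arg in arg_text.rstrip(")").split(",")
        ]
        results.append(OPS[name.strip()](*args))
    return results[-1]


# "What is 20 percent of 150?" with the numbers extracted in order of appearance.
answer = run_program("multiply(n0,n1)|divide(#0,const_100)", [20.0, 150.0])
```

Because every intermediate value is stored in `results`, each step of the program can be inspected, which is the sense in which a program of this kind is easier to interpret than a bare numeric answer.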

