Introduction
Solving math word problems (MWPs) poses unique challenges in understanding natural language and performing arithmetic reasoning over quantities with commonsense knowledge. As shown in Figure 1, a typical MWP consists of a short narrative that describes a situation in the world and asks a question about an unknown quantity. To solve the MWP in Figure 1, a machine needs to extract key quantities from the text, such as "100 kilometers" and "2 hours", and understand the relationships between them. General mathematical knowledge like "distance = velocity × time" is then used to calculate the solution.
Researchers have recently focused on solving MWPs with neural-symbolic models Ling et al. (2017); Wang et al. (2017); Huang et al. (2018); Wang et al. (2018); Xie and Sun (2019). These models usually consist of a neural perception module (e.g., Seq2Seq or Seq2Tree) that maps the problem text into a solution expression or tree, and a symbolic module that executes the expression to generate the final answer. Training these models requires full supervision with the solution expressions.
However, these fully-supervised approaches have three drawbacks. First, current MWP datasets provide only one solution for each problem, while there naturally exist multiple solutions that give different paths to solving the same problem. For instance, the problem in Figure 1 can be solved by first calculating the speed and then multiplying it by the total time; alternatively, we can solve it by summing the distances of the first and second parts of the journey. Models trained with full supervision on current datasets are forced to fit the given solution and cannot generate diverse solutions. Second, annotating the expressions for MWPs is time-consuming, whereas a large number of MWPs with their final answers can be mined effortlessly from the internet (e.g., online forums). How to efficiently utilize such partially-labeled data without the supervision of expressions remains an open problem. Third, current supervised learning approaches suffer from a train-test discrepancy: they optimize expression accuracy rather than answer accuracy, while the model is evaluated by answer accuracy on the test set, causing a natural performance gap.
To address these issues, we propose to solve MWPs with weak supervision, where only the problem texts and the final answers are required. By directly optimizing the answer accuracy rather than the expression accuracy, learning with weak supervision naturally eliminates the train-test discrepancy. Our model consists of a tree-structured neural model similar to Xie and Sun (2019) that generates the solution tree, and a symbolic execution module that calculates the answer. However, the symbolic execution module for arithmetic expressions is non-differentiable with respect to the answer accuracy, making it infeasible to compute gradients with back-propagation. A straightforward alternative is to employ policy gradient methods like REINFORCE Williams (1992) to train the neural model. These methods explore the solution space and update the policy based on generated solutions that happen to hit the correct answer; since the solution space is large and incorrect solutions are discarded with zero reward, they usually converge slowly or fail to converge.
To improve the efficiency of weakly-supervised learning, we propose a novel fixing mechanism to learn from incorrect predictions, inspired by the human ability to learn from failures via abductive reasoning Magnani (2009); Zhou (2019a). The fixing mechanism propagates the error from the root node to the leaf nodes of the solution tree and finds the most probable fix that generates the desired answer. The fixed solution tree is then used as a pseudo label to train the neural model. Figure 2 shows how the fixing mechanism corrects a wrong solution tree by tracing the error in a top-down manner.
Furthermore, we design two practical techniques to traverse the solution space and discover possible solutions efficiently. First, we observe a positive correlation between the number of quantities in the text and the size of the solution tree (its number of leaf nodes), and propose a tree regularization technique based on this observation to limit the range of possible tree sizes and shrink the solution space. Second, we adopt a memory buffer to track and save the fixes discovered for each problem with the fixing mechanism. All solutions in the memory buffer are used as pseudo labels for training, encouraging the model to generate multiple diverse solutions for a single problem.
In summary, combining the fixing mechanism with the above two techniques, the proposed learning-by-fixing (LBF) method contains an exploring stage and a learning stage in each iteration, as shown in Figure 2. In the exploring stage, we utilize the fixing mechanism and tree regularization to correct wrong answers and generate fixed expressions as pseudo labels. In the learning stage, we train the neural model on these pseudo labels.
We conduct comprehensive experiments on the Math23K dataset Wang et al. (2017). The proposed LBF method significantly outperforms the reinforcement learning baselines in weakly-supervised learning and achieves performance comparable to several fully-supervised methods. Furthermore, it achieves significantly better answer accuracies over all the top-3/5 answers than fully-supervised methods, illustrating its advantage in generating diverse solutions. Ablative experiments further demonstrate the efficacy of the designed components, including the fixing mechanism, tree regularization, and memory buffer.
Related Work
Math Word Problems
Recently, various question-answering tasks that require human-like reasoning abilities have emerged Qi et al. (2015); Tu et al. (2014); Zhang et al. (2019); Dua et al. (2019); Hong et al. (2019); Zhu et al. (2020); Zhang et al. (2020b); Li et al. (2020b); Yu et al. (2020). Among them, solving math word problems (MWPs) is a fundamental and challenging task.
Previous studies of MWPs range from traditional rule-based methods Fletcher (1985); Bakman (2007); Yuhui et al. (2010), statistical learning methods Kushman et al. (2014); Zhou et al. (2015); Mitra and Baral (2016); Roy and Roth (2017); Huang et al. (2016), and semantic-parsing methods Shi et al. (2015); Koncel-Kedziorski et al. (2015); Huang et al. (2017) to recent deep learning methods Ling et al. (2017); Wang et al. (2017); Huang et al. (2018); Robaidek et al. (2018); Wang et al. (2018, 2019); Chiang and Chen (2019); Xie and Sun (2019); Zhang et al. (2020a). In particular, Deep Neural Solver (DNS) Wang et al. (2017) is a pioneering work that designs a Seq2seq model to solve MWPs and achieves promising results. Xie and Sun (2019) propose a tree-structured neural solver that generates the solution tree in a goal-driven manner. All these neural solvers learn with full supervision, where the ground-truth intermediate representations (e.g., expressions, programs) are given during training. To learn the solver with less supervision, Koncel-Kedziorski et al. (2015) use a discriminative model to solve MWPs in a weakly-supervised way, with separate modules to extract features, construct expression trees, and score their likelihood, which differs from current end-to-end neural solvers. Upadhyay et al. (2016), Zhou et al. (2015), and Kushman et al. (2014) use mixed supervision, where one dataset has annotated equations and the other only final answers; however, for the set with final answers, they still depend on predefined equation templates. Chen et al. (2020) apply a neural-symbolic reader to MathQA Amini et al. (2019), a large-scale dataset with fully-specified operational programs, and have access to the ground-truth programs for a small fraction of the training samples in the first iterations of training.
Unlike these methods, the proposed LBF method requires only the supervision of the final answer and generates diverse solutions by keeping a memory buffer. Notably, it addresses the sparse reward problem in policy gradient methods using a fixing mechanism that propagates error down a solution tree and finds the most probable fix.
Neural-Symbolic Learning for NLP
Neural-symbolic learning has been applied to NLP tasks with weak supervision, such as semantic parsing and program synthesis Liang et al. (2016a); Guu et al. (2017); Liang et al. (2018); Agarwal et al. (2019); Li et al. (2020b). Similar to MWP solving, these tasks generate intermediate symbolic representations with a neural network and execute them with a symbolic reasoning module to obtain the final result. Typical approaches for such neural-symbolic models use policy gradient methods like REINFORCE, since the symbolic execution module is non-differentiable. For example, Neural Symbolic Machines Liang et al. (2016b) combines REINFORCE with a maximum-likelihood training process to find good programs. Guu et al. (2017) augment reinforcement learning with maximum marginal likelihood so that probability is distributed evenly across consistent programs. Memory Augmented Policy Optimization (MAPO) Liang et al. (2018) formulates its learning objective as an expectation over a memory buffer of high-reward samples plus a separate expectation outside the buffer, which helps accelerate and stabilize policy gradient training. Meta Reward Learning Agarwal et al. (2019) uses an auxiliary reward function to provide feedback beyond a binary success or failure. Since these methods can only learn from sparse successful samples, they suffer from cold starts and inefficient exploration of large search spaces. Recently, Dai and Zhou (2017), Dai et al. (2019), and Zhou (2019b) introduced abductive learning, which states that misperceptions can be corrected via abductive reasoning. In this paper, we follow the abductive learning method of Li et al. (2020a) and propose a novel fixing mechanism to learn from negative samples, significantly accelerating and stabilizing the weakly-supervised learning process. We further design the tree regularization and memory buffer techniques to efficiently shrink and explore the solution space.
Weakly-Supervised MWPs
In this section, we define weakly-supervised math word problems and describe the goal-driven tree model originating from Xie and Sun (2019). We then introduce the proposed learning-by-fixing method, as shown in Figure 2.
Problem Definition
A math word problem is represented by an input problem text P. A machine learning model with parameters θ is required to translate P into an intermediate expression T, which is executed to compute the final answer y. In fully-supervised learning, we learn from both the ground-truth expression T and the final answer y; the learning objective is to maximize the data likelihood p(T, y | P), where computing y given T is a deterministic process. In contrast, in the weakly-supervised setting, only P and y are observed, while T is hidden. In other words, the model is required to generate an unknown expression from the problem text, which is then executed to obtain the final answer.

Goal-driven Tree-Structured Model
A problem text P consists of words and numeric values. The model takes in the problem text P and generates a solution tree T. Let n_P denote the ordered list of numeric values in P according to their order in the problem text. Generally, T may contain constants V_con, mathematical operators V_op, and numeric values n_P from the problem text. Therefore, the target vocabulary of P is denoted as V = V_op ∪ V_con ∪ n_P, and it varies between problems due to different n_P.
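As a concrete illustration, the per-problem target vocabulary can be assembled from a fixed operator set, a constant set, and the numbers extracted from the text. The sketch below is an assumption about one possible implementation; the operator list, constant list, and regex are illustrative, not the paper's exact choices:

```python
import re

V_OP = ["+", "-", "*", "/"]   # operator vocabulary (illustrative)
V_CON = ["1", "3.14"]         # constants shared across problems (illustrative)

def extract_numbers(text):
    """Return the ordered list n_P of numeric values in the problem text."""
    return re.findall(r"\d+\.?\d*", text)

def target_vocab(text):
    """Per-problem target vocabulary V = V_op ∪ V_con ∪ n_P."""
    return V_OP + V_CON + extract_numbers(text)

nums = extract_numbers("A car travels 100 kilometers in 2 hours.")
# nums == ["100", "2"]; the vocabulary then has 4 + 2 + 2 = 8 tokens
```

Because n_P differs per problem, the decoder's output layer must be re-instantiated (or masked) for every input, which is why the paper emphasizes that V varies between problems.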
To generate the solution tree, we adopt the goal-driven tree-structured neural model (GTS) Xie and Sun (2019), which first encodes the problem text into its goal and then recursively decomposes it into sub-goals in a top-down manner.
Problem Encoding. Each word of the problem text is encoded into a contextual representation. Specifically, for a problem P = (x_1, x_2, ..., x_m), each word x_i is first converted to a word embedding e_i. The sequence of embeddings is then input to a bidirectional GRU Cho et al. (2014) to produce a contextual word representation h_i = h_i^f + h_i^b, where h_i^f and h_i^b are the hidden states of the forward and backward GRUs at position i, respectively.
Solution Tree Generation.
The tree generation process is designed as a preorder tree traversal (root-left-right). The root node of the solution tree is initialized with a goal vector q_0 derived from the problem encoding.
For a node with goal q, we first derive a context vector c by an attention mechanism that summarizes relevant information from the problem:

(1) a_i = exp(score(q, h_i)) / Σ_j exp(score(q, h_j))
(2) c = Σ_i a_i h_i

where score(q, h_i) = v_a^T tanh(W_a [q; h_i]), and v_a and W_a are trainable parameters. Then the goal q and the context c are used to predict the token of this node from the target vocabulary V. The probability of token y is defined as:

(3) s(y | q, c) = w_s^T tanh(W_s [q; c; e(y)])
(4) prob(y | q, c) = exp(s(y | q, c)) / Σ_{y' ∈ V} exp(s(y' | q, c))

where e(y) is the embedding of token y:

(5) e(y) = M_op(y) if y ∈ V_op;  M_con(y) if y ∈ V_con;  h_loc(y) if y ∈ n_P

where M_op and M_con are two trainable embeddings for operators and constants, respectively. For a number token, its embedding is the corresponding hidden state h_loc(y) from the encoder, where loc(y) is the index of y in the problem P. The predicted token is:

(6) ŷ = argmax_{y ∈ V} prob(y | q, c)
If the predicted token is a number token or constant, the node is terminated and its goal is realized by the predicted token; otherwise, the predicted token is an operator and the current goal is decomposed into left and right subgoals combined by the operator. Please refer to the supplementary material for more details about the goal decomposition process.
Answer Calculation. The generated solution tree is transformed into a reasoning tree by creating auxiliary non-terminal nodes in place of the operator nodes to store intermediate results; the original operator nodes are attached as child nodes of the corresponding auxiliary nodes. The final answer is then calculated by executing the reasoning tree up to the value of the root node in a bottom-up manner.
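Since the tree is generated by a preorder traversal, executing it bottom-up is equivalent to evaluating its prefix (Polish notation) token sequence. A minimal sketch, with illustrative numbers rather than the actual quantities of Figure 1:

```python
def execute_prefix(tokens):
    """Evaluate a prefix (Polish notation) expression bottom-up.

    tokens is the preorder traversal of a solution tree: operators
    and numeric strings (quantities or constants).
    """
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}

    def helper(pos):
        tok = tokens[pos]
        if tok in ops:                      # operator node: recurse into subtrees
            left, pos = helper(pos + 1)
            right, pos = helper(pos)
            return ops[tok](left, right), pos
        return float(tok), pos + 1          # leaf: quantity or constant

    value, _ = helper(0)
    return value

# (100 / 2) * (2 + 1): "speed times total time" with made-up numbers -> 150.0
assert execute_prefix(["*", "/", "100", "2", "+", "2", "1"]) == 150.0
```

The recursion mirrors the bottom-up execution of the reasoning tree: each operator node waits for the values of its two subtrees before producing its own intermediate result.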
Learning-by-Fixing
Fixing Mechanism
Drawing inspiration from humans' ability to correct and learn from failures, we propose a fixing mechanism that corrects wrong solution trees via abductive reasoning following Li et al. (2020a) and uses the fixed solution trees as pseudo labels for training. Specifically, we find the most probable fix for a wrong prediction by back-tracing the reasoning tree and propagating the error from the root node to the leaf nodes in a top-down manner.
The key ingredient of the fixing mechanism is the 1-step fix (1-FIX) algorithm, which assumes that only one symbol in the reasoning tree can be substituted. As shown by the 1-Fix function in Algorithm 1, the 1-step fix starts from the root node of the reasoning tree and searches downward for a fix that makes the final output equal to the ground truth. The search process is implemented with a priority queue, where each element is a fix-tuple (A, y_A, p):

- A is the current visiting node.
- y_A is the expected value of this node: if the value of A is changed to y_A, the tree will execute to the ground-truth answer y.
- p is the visiting priority, which reflects the probability of changing the value of A.
In 1-FIX, error propagation through the solution tree is achieved by a Solve function, which computes the expected value of a child node from its parent's expected value. Suppose B is a child of A and y_A is the expected value of A. The Solve function works as follows:

- If B is A's left or right operand, we directly solve the equation y_A = y_B op val(right) or y_A = val(left) op y_B to get B's expected value y_B, where op denotes the operator.
- If B is an operator node, we try replacing B with each other operator y_B and check whether the new expression satisfies y_A = val(left) y_B val(right). If no y_B satisfies this equation, the Solve function returns none.
Please refer to the supplementary material for the definition of the visiting priority as well as the illustrative example of the 1FIX process.
To search the neighbors of T within a multi-step distance, we extend the 1-step fix to multi-step (m-FIX) by incorporating a RandomWalk function. As shown in Algorithm 1, if 1-FIX finds a fix, we return it; otherwise, we randomly change one leaf node in the reasoning tree to another symbol within the same set (e.g., another operator in V_op) based on the probability in Equation 4. This process is repeated for a certain number of iterations or until a fix is found.
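The following much-simplified sketch illustrates only the error-propagation core of 1-FIX for substituting a single number leaf. The full algorithm additionally tries operator replacement, restricts substitutions to tokens in the target vocabulary, uses the model's token probabilities as priorities, and falls back to RandomWalk; all names below are illustrative:

```python
import heapq

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b}
# Inverse solvers: given the parent's expected value and the other child's
# value, recover the expected value of the left/right child.
SOLVE_LEFT = {"+": lambda y, r: y - r, "-": lambda y, r: y + r,
              "*": lambda y, r: y / r, "/": lambda y, r: y * r}
SOLVE_RIGHT = {"+": lambda y, l: y - l, "-": lambda y, l: l - y,
               "*": lambda y, l: y / l, "/": lambda y, l: l / y}

class Node:
    def __init__(self, token, left=None, right=None):
        self.token, self.left, self.right = token, left, right

    def value(self):
        if self.token in OPS:
            return OPS[self.token](self.left.value(), self.right.value())
        return float(self.token)

def one_fix_numbers(root, gt_answer, priority):
    """Search for a single number-leaf substitution that makes the tree
    execute to gt_answer.  priority(node) stands in for the model's
    willingness to change that node.  Returns (node, new_token) or None."""
    queue = [(-1.0, 0, root, gt_answer)]  # (neg-priority, tiebreak, node, expected)
    tick = 1
    while queue:
        _, _, node, expect = heapq.heappop(queue)
        if node.token not in OPS:
            # Leaf reached: substituting its value with `expect` fixes the tree.
            new_tok = format(expect, "g")
            if new_tok != node.token:
                return node, new_tok
            continue
        l, r = node.left.value(), node.right.value()
        # Propagate the expected value down to each child (error back-tracing);
        # assumes no division by zero for these illustrative inputs.
        for child, solved in ((node.left, SOLVE_LEFT[node.token](expect, r)),
                              (node.right, SOLVE_RIGHT[node.token](expect, l))):
            heapq.heappush(queue, (-priority(child), tick, child, solved))
            tick += 1
    return None

# Wrong prediction 100 * 2 = 200 with ground truth 50: preferring to edit
# the leaf "2" yields the fix 2 -> 0.5.
tree = Node("*", Node("100"), Node("2"))
node, tok = one_fix_numbers(tree, 50.0, lambda n: 0.9 if n.token == "2" else 0.1)
assert node.token == "2" and tok == "0.5"
```

The priority queue makes the search visit the substitution the model itself considers most plausible first, which is what lets the fixed tree double as a high-quality pseudo label.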
Solution Space Exploration
Tree Regularization. While Li et al. (2020a) assume the length of the intermediate representation is given, the expression length is unknown in weakly-supervised learning. The original solution space is thus infinite, since the predicted token decides whether the generation continues or stops. It is therefore critical to shrink the solution space, i.e., to control the size of the generated solution trees: if the size of a generated tree differs greatly from the target size, it is unlikely that the solution or its fix hits the correct answer. Although the target size is unknown, we observe a positive correlation between the target size and the number of quantities in the text. Regarding this observation as a prior on tree size, we design a tree regularization algorithm that generates a solution tree of a target size and regularizes the size within an empirical range. Denote the size of a solution tree, Size(T), as its number of leaf nodes, including quantities, constants, and operators. The prior range of Size(T) given the length n of the numeric value list n_P is defined as:

(7) Size(T) ∈ [2n + c_low, 2n + c_up]

where c_low and c_up are hyperparameters; their effect is discussed in Table 2.

We further propose a tree regularization algorithm to decode a solution tree with a given size. To generate a tree of a given size S, we design two rules to produce a valid prefix-order expression during the preorder tree decoding:
- The number of operators cannot be greater than (S − 1)/2.
- Except at the S-th (final) position, the number of numeric values (quantities and constants) cannot be greater than the number of operators.

These two rules are inspired by the syntax of prefix notation (a.k.a. normal Polish notation) for mathematical expressions. They shrink the target vocabulary V in Equation 6 so that tree generation stops exactly when the tree reaches the target size. Figure 3 shows illustrative examples of the tree regularization algorithm.
With tree regularization, we can search the possible fixes within a given range of tree size for each problem.
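Concretely, the two rules can be read as a mask over token classes at each decoding step. A sketch under the assumption that rule 1's bound is (S − 1)/2 operators for a target size S (function and names are illustrative):

```python
def allowed_classes(n_ops, n_nums, size):
    """Token classes ("op" / "num") permitted at the next position of a
    prefix expression decoded under tree regularization, given how many
    operators and numeric values have been emitted so far.

    size is the target tree size (total token count)."""
    pos = n_ops + n_nums + 1                 # 1-indexed position being decoded
    allowed = []
    if n_ops < (size - 1) // 2:              # rule 1: at most (size-1)/2 operators
        allowed.append("op")
    if pos == size or n_nums + 1 <= n_ops:   # rule 2: nums <= ops before the end
        allowed.append("num")
    return allowed

# Decoding a size-5 tree such as "+ * 100 2 1":
assert allowed_classes(0, 0, 5) == ["op"]    # must open with an operator
assert allowed_classes(1, 1, 5) == ["op"]    # "+ 100 2" would terminate too early
assert allowed_classes(2, 2, 5) == ["num"]   # the final slot must be a number
assert allowed_classes(0, 0, 1) == ["num"]   # a size-1 tree is a single number
```

Any sequence that respects this mask ends with exactly (S − 1)/2 operators and (S + 1)/2 numeric values, i.e., a complete prefix expression of the requested size, which is how the decoder is forced to stop at the target size.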
Memory Buffer. We adopt a memory buffer to track and save the discovered fixes for each problem. The memory buffer enables us to seek multiple solutions for a single problem and use all of them as pseudo labels for training, which encourages diverse solutions. Formally, given a problem P and its buffer B, the learning objective is to minimize the negative log-likelihood of all fixed expressions in the buffer:

(8) J(P) = − Σ_{T* ∈ B} log p(T* | P)
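The buffer bookkeeping and the objective in Equation 8 can be sketched as follows. Here token_prob is a stand-in for the model's prob(y | q, c); the real model conditions each token on the decoding state, which this toy function ignores:

```python
import math

def add_fix(buffer, fix):
    """Save a newly discovered fixed expression, keeping the buffer
    duplicate-free so each distinct solution is one pseudo label."""
    if fix not in buffer:
        buffer.append(fix)

def buffer_nll(buffer, token_prob):
    """Equation 8: negative log-likelihood summed over every fixed
    expression (a token list) in the problem's memory buffer."""
    return -sum(math.log(token_prob(tok)) for expr in buffer for tok in expr)

buf = []
add_fix(buf, ["*", "/", "100", "2", "3"])
add_fix(buf, ["*", "/", "100", "2", "3"])   # duplicate fix is ignored
add_fix(buf, ["+", "50", "100"])
assert len(buf) == 2
```

Because every buffered expression contributes to the loss, gradient updates pull probability mass toward all discovered solutions rather than only the most recent one, which is what encourages diversity.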
Learning-by-Fixing Framework
The complete learning-by-fixing method is described in Algorithm 2. In the exploring stage, we use the fixing mechanism and tree regularization to discover possible fixes for the wrong trees generated by the neural network and put them into the buffer. In the learning stage, we train the model on all the solutions in the memory buffer by minimizing the loss in Equation 8.
Experimental Results
Experimental Setup
Dataset. We evaluate our proposed method on the Math23K dataset Wang et al. (2017). It contains 23,161 math word problems annotated with solution expressions and answers. In the weakly-supervised setting, we use only the problems and final answers and discard the expressions. We perform cross-validation following the setting of Xie and Sun (2019).
Evaluation Metric. We evaluate model performance by answer accuracy, where a generated solution is considered correct if it executes to the ground-truth answer. Specifically, we report the answer accuracies of all the top-K predictions obtained with beam search, which evaluates the model's ability to generate multiple possible solutions.
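Under one plausible reading of this metric (an assumption; the paper does not spell out the formula), Acc@K pools all top-K beam predictions across the test set and measures the fraction that execute to the ground-truth answer:

```python
def acc_at_k(beam_values, gt_answers, k):
    """Fraction of all top-k beam predictions, pooled over the dataset,
    whose executed values match the ground-truth answers."""
    correct = total = 0
    for values, gt in zip(beam_values, gt_answers):
        for v in values[:k]:       # executed answers of the top-k beams
            correct += (v == gt)
            total += 1
    return correct / total

beams = [[50.0, 50.0, 20.0], [10.0, 5.0, 5.0]]   # illustrative executed values
answers = [50.0, 10.0]
assert acc_at_k(beams, answers, 1) == 1.0   # both top-1 answers are right
assert acc_at_k(beams, answers, 3) == 0.5   # 3 of the 6 pooled predictions
```

This reading is consistent with Acc@3/5 being lower than Acc@1 in the reported tables: lower-ranked beams are held to the same correctness standard.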
Models. We compare our method with variants of weakly-supervised learning methods. Specifically, we experiment with two inference models: Seq2Seq with a bidirectional Long Short-Term Memory network (BiLSTM) Wu et al. (2016) and GTS Xie and Sun (2019), and train them with four learning strategies: REINFORCE, MAPO Liang et al. (2018), LBF, and LBF-w/o-M (LBF without the memory buffer). MAPO is a state-of-the-art method for semantic parsing that extends REINFORCE with augmented memory. All models are also trained with the tree regularization algorithm. We further compare with fully-supervised learning methods to demonstrate our superiority in generating diverse solutions. In the ablative studies, we analyze the effect of the proposed tree regularization and the number of search steps in the fixing mechanism.
Comparisons with Stateoftheart
Table 1 summarizes the answer accuracy of different weakly-supervised learning methods and the state-of-the-art fully-supervised approaches. The proposed learning-by-fixing framework significantly outperforms the policy gradient baselines, REINFORCE and MAPO, on both the Seq2seq and the GTS models, demonstrating the strength of the proposed LBF method in weakly-supervised learning. The GTS-LBF-fully model is trained by initializing the memory buffer with all the ground-truth expressions; it shows that, extended to the fully-supervised setting, our model maintains the top-1 accuracy while significantly improving the diversity of solutions. We believe that learning MWPs with weak supervision is a promising direction: it requires fewer annotations and allows building larger datasets at lower cost.
Model                                      Accuracy (%)
Fully-Supervised
  Retrieval Robaidek et al. (2018)             47.2
  Classification Robaidek et al. (2018)        57.9
  LSTM Robaidek et al. (2018)                  51.9
  CNN Robaidek et al. (2018)                   42.3
  DNS Wang et al. (2017)                       58.1
  Seq2seq-ET Wang et al. (2018)                66.7
  Stack-Decoder Chiang and Chen (2019)         65.8
  T-RNN Wang et al. (2019)                     66.9
  GTS Xie and Sun (2019)                       74.3
  Graph2Tree Zhang et al. (2020a)              74.8*
  GTS-LBF-fully                                74.1
Weakly-Supervised
  Seq2seq + REINFORCE                           1.2
  Seq2seq + MAPO                               10.7
  Seq2seq + LBF-w/o-M                          44.7
  Seq2seq + LBF                                43.6
  GTS + REINFORCE                              15.8
  GTS + MAPO                                   20.8
  GTS + LBF-w/o-M                              58.3
  GTS + LBF                                    59.4

* We ran the code three times using the same setting as GTS and report the average accuracy.
Convergence Speed
Figure 4 shows the learning curves of different weakly-supervised learning methods for the GTS model. The proposed LBF method converges significantly faster and achieves higher accuracy than the other methods. Both REINFORCE and MAPO take a long time to start improving, indicating that policy gradient methods suffer from a cold start and need time to accumulate rewarding samples.
Diverse Solutions with Memory Buffer
To evaluate the ability to generate diverse solutions, we report the answer accuracies of all the top-1/3/5 solutions on the test set using beam search, denoted as Acc@1/3/5, in Table 2. In the weakly-supervised scenario, GTS-LBF achieves slightly better Acc@1 and much better Acc@3/5 than GTS-LBF-w/o-M. In the fully-supervised scenario, GTS-LBF-fully achieves comparable Acc@1 and much better Acc@3/5 than the original GTS model. In particular, GTS-LBF-fully outperforms GTS by 21% and 26% in Acc@3 and Acc@5, respectively. This reveals the efficacy of the memory buffer in encouraging diverse solutions in both weakly-supervised and fully-supervised learning.
Model             Tree Size       Acc@1   Acc@3   Acc@5
Fully-Supervised
  GTS               -              74.3    42.2    30.0
  GTS-LBF-fully     -              74.1    63.4    56.3
Weakly-Supervised
  GTS-LBF-w/o-M    [1, ∞)           0       0       0
                   [2n-1, 2n+1]    55.3    26.2    19.3
                   [2n-1, 2n+3]    58.3    27.7    20.3
                   [2n-3, 2n+5]    56.7    27.7    20.6
  GTS-LBF          [1, ∞)           0       0       0
                   [2n-1, 2n+1]    56.7    45.3    39.1
                   [2n-1, 2n+3]    59.4    49.6    45.2
                   [2n-3, 2n+5]    57.6    49.3    45.2
Qualitative Analysis
We visualize several examples of the top-5 predictions of GTS-LBF in Figure 5. In the first example, the first solution generated by our model sums the prices of a table and a chair and then multiplies the sum by the number of table-and-chair pairs. Our model can also produce another reasonable solution (the fifth column) by deriving the prices of the tables and chairs separately and then summing them.
One caveat for the multiple solutions is that some solutions have different solution trees but are equivalent by switching the order of numeric values or subtrees, as shown in the first four solutions of the first problem in Figure 5. In particular, multiplication and addition are commutative, and our model learns and exploits this property to generate equivalent solutions with different tree structures.
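One way to detect such order-equivalent duplicates (a hypothetical post-hoc check, not part of the paper's pipeline) is to canonicalize trees by sorting the operands of commutative operators:

```python
def canonical(tree):
    """Canonical form of an expression tree given as nested tuples,
    e.g. ("+", "2", ("*", "3", "4")).  Operands of the commutative
    operators + and * are sorted so that trees differing only in
    operand order compare equal."""
    if not isinstance(tree, tuple):
        return tree                      # leaf: a quantity or constant
    op, left, right = tree
    kids = [canonical(left), canonical(right)]
    if op in ("+", "*"):
        kids.sort(key=repr)              # commutative: order is irrelevant
    return (op, kids[0], kids[1])

# "2 + 100" and "100 + 2" share one canonical form; "2 - 100" keeps its order.
assert canonical(("+", "2", "100")) == canonical(("+", "100", "2"))
assert canonical(("-", "2", "100")) != canonical(("-", "100", "2"))
```

This catches commutativity-based duplicates like the first four solutions of the first problem in Figure 5, though not equivalences that require associativity or algebraic rewriting.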
         Right   Wrong   Spurious
Acc@1    58.6    40.6    0.56
Acc@3    49.3    50.4    0.27
Acc@5    44.9    54.8    0.32
The first solution to the fourth problem in Figure 5 is a typical error case of our model, caused by a wrong prediction of the problem goal. Another failure type is spurious solutions, which reach the correct answer but are not meaningful, such as the second solution of the third problem in Figure 5. To test how frequently spurious solutions appear, we randomly selected 500 examples from the test set and asked three human annotators to label each generated expression as right, wrong, or spurious. Table 3 reports the human evaluation results, showing that spurious solutions are rare in our model.
Ablative Analyses
Tree Regularization.
We test different choices of the hyperparameters defined in Equation 7 for tree regularization. As shown in Table 2, the model without tree regularization, i.e., tree size in [1, ∞), fails to converge and gets nearly 0 accuracy. The best range for the solution tree size is [2n-1, 2n+3], where n is the number of quantities in the problem. We provide an intuitive interpretation of this range: for a problem with n quantities, (1) n-1 operators are needed to connect the n quantities, which gives a lower bound of 2n-1 on the tree size; (2) in certain cases, constants or quantities are used more than once, leading to a rough upper bound of 2n+3. Therefore, we use [2n-1, 2n+3] as the default range in our implementation. Empirically, this range covers 88% of the lengths of the given ground-truth expressions in the Math23K dataset, providing an efficient prior for the tree size.
Number of Search Steps
Table 4 compares various search-step lengths in the m-FIX algorithm. In most cases, increasing the step length improves the chance of correcting wrong solutions and thus improves performance.
Models \ Steps          1      10     50 (default)   100
Seq2seq-LBF-w/o-M      41.9   43.4       44.7        47.8
Seq2seq-LBF            43.9   45.7       43.6        44.6
GTS-LBF-w/o-M          51.2   54.6       58.3        57.8
GTS-LBF                52.5   55.8       59.4        59.6
Conclusion
In this work, we propose a weakly-supervised paradigm for learning MWPs and a novel learning-by-fixing framework to boost the learning. Our method endows the MWP learner with the ability to learn from wrong solutions, significantly improving answer accuracy and learning efficiency. One future direction is to prevent the generation of equivalent or spurious solutions during training, possibly by making the generated solution trees more interpretable with semantic constraints.
Ethical Impact
The presented work should be categorized as research in the fields of weakly-supervised learning and abductive reasoning. It can help teachers in school obtain various solutions to a math word problem. This work may also inspire new algorithmic, theoretical, and experimental investigations in neural-symbolic methods and NLP tasks.
Acknowledgement
The work reported herein is supported by ARO W911NF-18-1-0296, DARPA XAI N66001-17-2-4029, and ONR MURI N00014-16-1-2007.
References
Learning to generalize from sparse and underspecified rewards. In ICML.
MathQA: towards interpretable math word problem solving with operation-based formalisms. In NAACL-HLT.
Robust understanding of word problems with extraneous information.
Neural symbolic reader: scalable integration of distributed and symbolic representations for reading comprehension. In ICLR.
Semantically-aligned equation generation for solving and reasoning math word problems. arXiv abs/1811.00720.
Learning phrase representations using RNN encoder-decoder for statistical machine translation. In EMNLP.
Bridging machine learning and logical reasoning by abductive learning. In NeurIPS, pp. 2811–2822.
Combining logical abduction and statistical induction: discovering written primitives with human knowledge. In Thirty-First AAAI Conference on Artificial Intelligence.
DROP: a reading comprehension benchmark requiring discrete reasoning over paragraphs. In NAACL-HLT.
Understanding and solving arithmetic word problems: a computer simulation. Behavior Research Methods, Instruments, & Computers 17, pp. 565–571.
From language to programs: bridging reinforcement learning and maximum marginal likelihood. In ACL.
Academic reader: an interactive question answering system on academic literatures. In Thirty-Third AAAI Conference on Artificial Intelligence.
Neural math word problem solver with reinforcement learning. In COLING.
How well do computers solve math word problems? Large-scale dataset construction and evaluation. In ACL.
Learning fine-grained expressions to solve math word problems. In EMNLP.
Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, pp. 585–597.
Learning to automatically solve algebra word problems. In ACL.
Closed loop neural-symbolic learning via integrating neural perception, grammar parsing, and symbolic reasoning. In ICML.
A competence-aware curriculum for visual concepts learning via question answering. In ECCV.
Neural symbolic machines: learning semantic parsers on Freebase with weak supervision. arXiv preprint arXiv:1611.00020.
Neural symbolic machines: learning semantic parsers on Freebase with weak supervision. In ACL.
Memory augmented policy optimization for program synthesis and semantic parsing. In NeurIPS.
Program induction by rationale generation: learning to solve and explain algebraic word problems. arXiv abs/1705.04146.
Abductive cognition: the epistemological and eco-cognitive dimensions of hypothetical reasoning. Vol. 3, Springer Science & Business Media.
Learning to use formulas to solve simple arithmetic problems. In ACL.
A restricted visual Turing test for deep scene and event understanding. arXiv abs/1512.01715.
Data-driven methods for solving algebra word problems. arXiv abs/1804.10718.
Unit dependency graph and its application to arithmetic word problem solving. In AAAI.
Automatically solving number word problems by semantic parsing and reasoning. In EMNLP.
Joint video and text parsing for understanding events and answering queries. IEEE MultiMedia 21, pp. 42–70.
Learning from explicit and implicit supervision jointly for algebra word problems. In EMNLP.
Translating math word problem to expression tree. In EMNLP.
Template-based math word problem solvers with recursive neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 33(01), pp. 7144–7151.
Deep neural solver for math word problems. In EMNLP, Copenhagen, Denmark, pp. 845–854.
Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8(3–4), pp. 229–256.
Google's neural machine translation system: bridging the gap between human and machine translation. arXiv abs/1609.08144.
A goal-driven tree-structured neural model for math word problems. In IJCAI.
ReClor: a reading comprehension dataset requiring logical reasoning. arXiv abs/2002.04326.
Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In 2010 Second International Workshop on Education Technology and Computer Science, Vol. 2, pp. 476–479.
RAVEN: a dataset for relational and analogical visual reasoning. In CVPR, pp. 5312–5322.
Graph-to-tree learning for solving math word problems. In ACL.
Machine number sense: a dataset of visual arithmetic problems for abstract and relational reasoning. arXiv abs/2004.12193.
Learn to solve algebra word problems using quadratic programming. In EMNLP.
Abductive learning: towards bridging machine learning and logical reasoning. Science China Information Sciences 62, pp. 1–3.
Abductive learning: towards bridging machine learning and logical reasoning. Science China Information Sciences 62, pp. 1–3.
Dark, beyond deep: a paradigm shift to cognitive AI with human-like common sense. Engineering.