What If: Generating Code to Answer Simulation Questions

by Gal Peretz, et al.

Many texts, especially in chemistry and biology, describe complex processes. We focus on texts that describe a chemical reaction process and on questions that ask about the process's outcome under different environmental conditions. Answering questions about such processes requires understanding the interactions between the entities involved and simulating their state transitions during process execution under different conditions. A state transition is defined as the modification a program makes to its variables in memory during execution. We hypothesize that generating code and executing it to simulate the process makes it possible to answer such questions. We therefore define a domain-specific language (DSL) to represent processes. We contribute to the community a unique dataset curated by chemists and annotated by computer scientists, composed of process texts, simulation questions, and the corresponding computer code written in the DSL. We propose a neural program synthesis approach based on reinforcement learning with a novel state-transition semantic reward, computed from the run-time semantic similarity between the predicted code and the reference code. This allows simulating complex process transitions and thus answering simulation questions. Our approach yields a significant boost in accuracy on simulation questions: 88%, compared with 83% for state-of-the-art neural program synthesis approaches and 54% for state-of-the-art end-to-end text-based approaches.
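The core idea, executing generated code to simulate a process's state transitions and scoring a predicted program by the run-time similarity of its state trace to a reference program's trace, can be sketched as follows. This is a minimal illustration, not the paper's actual DSL or reward: the `Process` class, the toy dissolve/evaporate steps, and the trace-matching score are all invented for exposition.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """A process is a sequence of steps that mutate a shared state dict."""
    steps: list  # each step: fn(state, conditions) -> None (mutates state)

    def run(self, conditions, init):
        state = dict(init)
        trace = [dict(state)]  # snapshot the state after every step
        for step in self.steps:
            step(state, conditions)
            trace.append(dict(state))
        return trace


def transition_similarity(trace_a, trace_b):
    """Fraction of matching state snapshots between two runs: a crude
    stand-in for a run-time state-transition semantic reward."""
    n = max(len(trace_a), len(trace_b))
    matches = sum(1 for a, b in zip(trace_a, trace_b) if a == b)
    return matches / n


# Toy "chemical" process: heating dissolves a solute; strong heat evaporates water.
def dissolve(state, cond):
    if cond["temp_c"] > 25:
        state["dissolved_g"] += 10


def evaporate(state, cond):
    if cond["temp_c"] > 80:
        state["water_ml"] -= 50


reference = Process([dissolve, evaporate])   # gold program
predicted = Process([dissolve, evaporate])   # candidate synthesized program

init = {"dissolved_g": 0, "water_ml": 200}
hot = {"temp_c": 90}
reward = transition_similarity(reference.run(hot, init),
                               predicted.run(hot, init))
print(reward)  # identical traces under these conditions -> 1.0
```

Because the reward compares intermediate states rather than surface code, a predicted program that reaches the same states through differently written steps can still score highly, which is the motivation for a semantic rather than syntactic reward.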




