Dynamic Neural Program Embedding for Program Repair

11/20/2017
by Ke Wang, et al.

Neural program embeddings have recently shown much promise for a variety of program analysis tasks, including program synthesis, program repair, and fault localization. However, most existing program embeddings are based on syntactic features of programs, such as raw token sequences or abstract syntax trees. Unlike images and text, a program has an unambiguous semantic meaning that can be difficult to capture by considering only its syntax (i.e., syntactically similar programs can exhibit vastly different run-time behavior), which makes syntax-based program embeddings fundamentally limited. This paper proposes a novel semantic program embedding that is learned from program execution traces. Our key insight is that program states, expressed as sequential tuples of live variable values, not only capture program semantics more precisely but also offer a more natural fit for Recurrent Neural Networks to model. We evaluate different syntactic and semantic program embeddings on predicting the types of errors that students make in their submissions to an introductory programming class and to two exercises on the CodeHunt education platform. Evaluation results show that our new semantic program embedding significantly outperforms the syntactic program embeddings based on token sequences and abstract syntax trees. In addition, we augment a search-based program repair system with the predictions obtained from our semantic embedding, and show that search efficiency is also significantly improved.
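To make the idea of "program states expressed as sequential tuples of variable values" concrete, the following is a minimal Python sketch, not the authors' instrumentation. It assumes that recording all local variables at each executed line is an acceptable stand-in for a true live-variable analysis, and the names `collect_state_trace`, `to_tuples`, `sum_to_n`, and `sum_to_n_buggy` are hypothetical. It shows two syntactically near-identical programs producing different state sequences, which is the kind of signal a syntax-only embedding cannot see.

```python
import sys


def collect_state_trace(func, *args):
    """Run func(*args) and snapshot its local variables at every executed line.

    Assumption: all locals are recorded; a real system would restrict
    this to live variables.
    """
    states = []
    target_code = func.__code__

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the function under study.
        if frame.f_code is not target_code:
            return None
        # "line" fires before a line runs; "return" captures the final state.
        if event in ("line", "return"):
            states.append(dict(frame.f_locals))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Always restore whatever tracer was installed before.
        sys.settrace(previous)
    return states


def to_tuples(states, variables, missing=None):
    """Turn each snapshot into a fixed-order tuple, one tuple per step."""
    return [tuple(state.get(name, missing) for name in variables) for state in states]


def sum_to_n(n):
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1
    return total


def sum_to_n_buggy(n):
    # Differs from sum_to_n by a single token: "<" instead of "<=".
    total = 0
    i = 1
    while i < n:
        total += i
        i += 1
    return total


variables = ("n", "total", "i")
trace_ok = to_tuples(collect_state_trace(sum_to_n, 3), variables)
trace_bad = to_tuples(collect_state_trace(sum_to_n_buggy, 3), variables)
```

Each element of `trace_ok` or `trace_bad` is one program state; the resulting sequence is the kind of input a recurrent model could consume step by step, after the values are encoded numerically. How the paper encodes values and selects variables is not stated in the abstract, so those choices are left open here.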


