Learning to Represent Programs with Graphs

11/01/2017
by Miltiadis Allamanis, et al.

Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered. We propose to use graphs to represent both the syntactic and semantic structure of code and use graph-based deep learning methods to learn to reason over program structures. In this work, we present how to construct graphs from source code and how to scale Gated Graph Neural Networks training to such large graphs. We evaluate our method on two tasks: VarNaming, in which a network attempts to predict the name of a variable given its usage, and VarMisuse, in which the network learns to reason about selecting the correct variable that should be used at a given program location. Our comparison to methods that use less structured program representations shows the advantages of modeling known structure, and suggests that our models learn to infer meaningful names and to solve the VarMisuse task in many cases. Additionally, our testing showed that VarMisuse identifies a number of bugs in mature open-source projects.
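As a rough illustration of the idea of representing both syntactic and semantic structure in one graph, the sketch below builds a tiny program graph from Python source using the standard `ast` module. It is only a sketch under stated assumptions, not the construction used in the paper: the function name `build_program_graph` and the edge labels `child` and `same_variable` are hypothetical, and linking successive occurrences of the same identifier is a simplified stand-in for the long-range variable dependencies mentioned above.

```python
import ast
from collections import defaultdict


def _is_graph_node(node):
    # Keep modules, statements and expressions; skip operator/context
    # helper objects, which the parser may share between AST nodes.
    return isinstance(node, (ast.mod, ast.stmt, ast.expr))


def build_program_graph(source):
    """Return (nodes, edges) for a small illustrative program graph.

    nodes: list of labels, indexed by node id.
    edges: set of (src_id, edge_type, dst_id) triples.
    Edge types are hypothetical names chosen for this sketch.
    """
    tree = ast.parse(source)
    nodes, edges, node_ids = [], set(), {}

    # One graph node per kept AST node.
    for node in ast.walk(tree):
        if _is_graph_node(node):
            node_ids[id(node)] = len(nodes)
            label = node.id if isinstance(node, ast.Name) else type(node).__name__
            nodes.append(label)

    # Syntactic structure: parent -> child edges from the AST.
    for parent in ast.walk(tree):
        if not _is_graph_node(parent):
            continue
        for child in ast.iter_child_nodes(parent):
            if _is_graph_node(child):
                edges.add((node_ids[id(parent)], "child", node_ids[id(child)]))

    # Simplified semantic structure: link each occurrence of a name to the
    # previous occurrence of the same name, however far apart they are.
    occurrences = defaultdict(list)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            occurrences[node.id].append(node)
    for occs in occurrences.values():
        occs.sort(key=lambda n: (n.lineno, n.col_offset))
        for prev, nxt in zip(occs, occs[1:]):
            edges.add((node_ids[id(nxt)], "same_variable", node_ids[id(prev)]))

    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_program_graph("x = 1\ny = x + 2\nprint(y)\n")
    for src, kind, dst in sorted(edges):
        print(f"{nodes[src]} --{kind}--> {nodes[dst]}")
```

A graph neural network would then pass messages along these typed edges, so information about a variable can flow directly between distant uses rather than through every intervening token.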

research
12/08/2020

Learning to Represent Programs with Heterogeneous Graphs

Program source code contains complex structure information, which can be...
research
03/03/2019

CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling

Recently many NLP-based deep learning models have been applied to model ...
research
05/22/2018

Generative Code Modeling with Graphs

Generative models for source code are an interesting structured predicti...
research
05/18/2020

Learning Semantic Program Embeddings with Graph Interval Neural Network

Learning distributed representations of source code has been a challengi...
research
03/24/2021

deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search

With the rapid increase in the amount of public code repositories, devel...
research
10/29/2018

Deep learning long-range information in undirected graphs with wave networks

Graph algorithms are key tools in many fields of science and technology....
