Modular Tree Network for Source Code Representation Learning

04/01/2021
by   Wenhan Wang, et al.
0

Learning representation for source code is a foundation of many program analysis tasks. In recent years, neural networks have already shown success in this area, but most existing models did not make full use of the unique structural information of programs. Although abstract syntax tree-based neural models can handle the tree structure in the source code, they cannot capture the richness of different types of substructure in programs. In this paper, we propose a modular tree network (MTN) which dynamically composes different neural network units into tree structures based on the input abstract syntax tree. Different from previous tree-structural neural network models, MTN can capture the semantic differences between types of ASTsubstructures. We evaluate our model on two tasks: program classification and code clone detection. Ourmodel achieves the best performance compared with state-of-the-art approaches in both tasks, showing the advantage of leveraging more elaborate structure information of the source code.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/31/2020

On the Generalizability of Neural Program Analyzers with respect to Semantic-Preserving Program Transformations

With the prevalence of publicly available source code repositories to tr...
research
03/09/2019

Program Classification Using Gated Graph Attention Neural Network for Online Programming Service

The online programing services, such as Github,TopCoder, and EduCoder, h...
research
12/08/2020

Learning to Represent Programs with Heterogeneous Graphs

Program source code contains complex structure information, which can be...
research
11/14/2021

Code Representation Learning with Prüfer Sequences

An effective and efficient encoding of the source code of a computer pro...
research
02/07/2020

What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning

Recent successes in training word embeddings for NLP tasks have encourag...
research
08/29/2018

Retrieval-Based Neural Code Generation

In models to generate program source code from natural language, represe...
research
07/14/2019

Automatic Repair and Type Binding of Undeclared Variables using Neural Networks

Deep learning had been used in program analysis for the prediction of hi...

Please sign up or login with your details

Forgot password? Click here to reset