Convolutional Neural Networks over Tree Structures for Programming Language Processing

09/18/2014
by   Lili Mou, et al.
0

Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/27/2019

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Program comprehension is a fundamental task in software development and ...
research
10/03/2018

AST-Based Deep Learning for Detecting Malicious PowerShell

With the celebrated success of deep learning, some attempts to develop e...
research
07/02/2018

Semantic Query Language for Temporal Genealogical Trees

Computers play a crucial role in modern ancestry management, they are us...
research
03/02/2021

Implementing G-Machine in HyperLMNtal

Since language processing systems generally allocate/discard memory with...
research
02/11/2020

Modeling Programs Hierarchically with Stack-Augmented LSTM

Programming language modeling has attracted extensive attention in recen...
research
08/20/2021

Fex: Assisted Identification of Domain Features from C Programs

Modern software typically performs more than one functionality. These fu...
research
11/14/2018

A Grammar-Based Structural CNN Decoder for Code Generation

Code generation maps a program description to executable source code in ...

Please sign up or login with your details

Forgot password? Click here to reset