Dependency-Based Neural Representations for Classifying Lines of Programs

04/08/2020
by Shashank Srikant, et al.

We investigate the problem of using machine learning to classify whether a line of a program contains a vulnerability. Such a line-level classification task calls for a program representation that goes beyond reasoning about the tokens present in the line. We seek a distributed representation in a latent feature space that captures the control and data dependencies of the tokens appearing on a line, while also ensuring that lines of similar meaning have similar features. We present a neural architecture, Vulcan, that meets both of these requirements. It extracts contextual information about the tokens in a line and feeds it, in the form of Abstract Syntax Tree (AST) paths, to a bi-directional LSTM with an attention mechanism. Concurrently, it represents the meanings of the tokens in a line by recursively embedding the lines where they were most recently defined. In our experiments, Vulcan compares favorably with a state-of-the-art classifier that requires significant preprocessing of programs, suggesting the utility of using deep learning to model program dependence information.
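
To make the idea of "the lines where tokens are most recently defined" concrete, here is a minimal sketch, not the authors' implementation. It assumes Python source and Python's standard `ast` module purely for illustration, and it approximates "most recent definition" by textual order, ignoring control flow. The function name `latest_definitions` and the sample program are hypothetical.

```python
import ast


def latest_definitions(source: str, target_line: int) -> dict:
    """Map each name read on `target_line` to the line of its most recent
    earlier assignment. Assumption: textual order stands in for real
    data-flow analysis, so branches and loops are not handled."""
    tree = ast.parse(source)
    stores = []   # (line number, name) for every assignment target
    loads = set() # names read on the target line
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stores.append((node.lineno, node.id))
            elif isinstance(node.ctx, ast.Load) and node.lineno == target_line:
                loads.add(node.id)

    result = {}
    for name in sorted(loads):
        earlier = [ln for ln, n in stores if n == name and ln < target_line]
        if earlier:
            result[name] = max(earlier)
    return result


# Hypothetical four-line program used only to exercise the sketch.
example = "a = 1\nb = a + 2\na = b * 3\nc = a + b\n"
print(latest_definitions(example, 4))
```

A model along the lines described above would then embed each of those defining lines recursively; how Vulcan does so is specified in the full text, not in this sketch.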

