An Evaluation of Programming Language Models' Performance on Software Defect Detection

09/10/2019
by   Kailun Wang, et al.

This dissertation presents an evaluation of several language models on software defect datasets. A language model (LM) "can provide word representation and probability indication of word sequences as the core component of an NLP system." Language models for source code are specialized for tasks in the software engineering field. Some of these models are taken directly from NLP, while others incorporate structural information that is unique to source code. Software defects are flaws in the source code that lead to unexpected behaviours and malfunctions at all levels. This study makes an original attempt to detect these defects at three different levels: syntactical, algorithmic and general. We also provide a tool chain that researchers can use to reproduce the experiments. We tested the models against several datasets and analysed the results. In our original attempt to deploy BERT, the state-of-the-art model across multiple tasks, it matched or outperformed all the other models in the comparison.
