Predicting Vulnerability In Large Codebases With Deep Code Representation

04/24/2020
by Anshul Tanwar, et al.

While software engineers write code for various modules, errors of many kinds (coding, logic, semantic, and others), most of which are not caught by compilers and other tools, are often introduced. Some of these bugs are found in later stages of testing, and many are reported by customers against production code. Companies spend considerable money and time finding and fixing bugs that could have been avoided had the code been written correctly in the first place. Concealed flaws in software can also lead to security vulnerabilities that allow attackers to compromise systems and applications. Interestingly, the same or similar bugs that were fixed in the past, although in different modules, tend to be reintroduced into production code. We developed a novel AI-based system that uses a deep representation of the Abstract Syntax Tree (AST) built from the source code, together with an active feedback loop, to identify potential bugs and alert the developer at development time, i.e., as the developer is writing new code (logic and/or a function). Integrated with an IDE as a plugin, the tool works in the background and points out existing similar functions or code segments and any bugs associated with them. It enables the developer to incorporate suggestions during development rather than waiting for unit testing (UT), QA, or a customer to raise a defect. We assessed the tool on both open-source code and the Cisco codebase for the C and C++ programming languages. Our results confirm that a deep representation of source code combined with an active feedback loop is a promising approach for predicting security and other vulnerabilities present in code.
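
The idea of flagging new code that structurally resembles a previously buggy function can be illustrated with a minimal sketch. This is not the authors' system: it assumes Python source and Python's standard `ast` module instead of C/C++, and it swaps the learned deep AST representation for plain counts of AST node types compared by cosine similarity. The names `KNOWN_FUNCTIONS`, `NEW_CODE`, `find_similar`, the bug note, and the 0.9 threshold are all hypothetical and chosen only for illustration.

```python
import ast
import math
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count AST node types in a snippet.

    Assumption: a crude stand-in for a learned AST embedding.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(new_code: str, known_functions: dict, threshold: float = 0.9):
    """Return (name, score, bug_note) for known functions that resemble new_code."""
    new_vec = node_type_counts(new_code)
    hits = []
    for name, (source, bug_note) in known_functions.items():
        score = cosine_similarity(new_vec, node_type_counts(source))
        if score >= threshold:
            hits.append((name, score, bug_note))
    # Most similar first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical history of existing functions and any bug recorded against them.
KNOWN_FUNCTIONS = {
    "copy_items": (
        "def copy_items(src, n):\n"
        "    out = []\n"
        "    for i in range(n):\n"
        "        out.append(src[i])\n"
        "    return out\n",
        "hypothetical past defect: no bounds check on n",
    ),
    "greet": (
        "def greet(name):\n"
        "    return 'hello ' + name\n",
        None,
    ),
}

# Code the developer is writing now: same structure as copy_items, new names.
NEW_CODE = (
    "def clone_values(data, count):\n"
    "    result = []\n"
    "    for j in range(count):\n"
    "        result.append(data[j])\n"
    "    return result\n"
)

for name, score, bug_note in find_similar(NEW_CODE, KNOWN_FUNCTIONS):
    print(f"{name}: similarity={score:.3f}, note={bug_note}")
```

Because the comparison uses node types rather than identifiers, renaming variables does not hide the resemblance to `copy_items`. A learned representation, as described in the abstract, would be expected to capture much richer structure than these counts.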


02/14/2018

Automated software vulnerability detection with machine learning

Thousands of security vulnerabilities are discovered in production softw...
07/04/2021

From Library Portability to Para-rehosting: Natively Executing Microcontroller Software on Commodity Hardware

Finding bugs in microcontroller (MCU) firmware is challenging, even for ...
10/05/2021

SiliFuzz: Fuzzing CPUs by proxy

CPUs are becoming more complex with every generation, at both the logica...
01/18/2022

BinGo: Pinpointing Concurrency Bugs in Go via Binary Analysis

Golang (also known as Go for short) has become popular in building concu...
02/26/2020

Is the OWASP Top 10 list comprehensive enough for writing secure code?

The OWASP Top 10 is a list that is published by the Open Web Application...
03/09/2021

How to integrate with real cars – minimizing lead time at Volkswagen

The most successful tech companies of the world release new software ver...
07/16/2021

Loop Transformations using Clang's Abstract Syntax Tree

OpenMP 5.1 introduced the first loop nest transformation directives unro...