MARFCAT: Transitioning to Binary and Larger Data Sets of SATE IV

07/16/2012
by Serguei A. Mokhov, et al.

We present the second iteration of a machine learning approach to static code analysis and fingerprinting for weaknesses related to security, software engineering, and other concerns. The approach uses the open-source MARF framework and the MARFCAT application built on it, applied to the data sets of NIST's SATE IV static analysis tool exposition workshop, which add further test cases, including new large synthetic ones. For detecting weak or vulnerable code, whether source or binary and on different platforms, the machine learning approach proved fast and accurate on tasks where other tools are either much slower or have much lower recall of known vulnerabilities. We use signal processing and NLP techniques to accomplish the identification and classification tasks. MARFCAT's design, from its beginning in 2010, made it independent of the language being analyzed, whether source code, bytecode, or binary. In this follow-up work we explore some preliminary results in this area. We also evaluated additional algorithms for processing the data.
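To make the idea of language-independent, signal-style classification concrete, here is a minimal sketch and not MARFCAT's actual implementation. It assumes that a file's raw bytes are treated as a signal, that an averaged DFT magnitude spectrum serves as the feature vector, and that a sample is assigned to the class whose mean spectrum is closest by cosine similarity. The window size, the function names (`spectrum`, `train`, `classify`), and the example snippets are all hypothetical.

```python
import cmath
import math

WINDOW = 16  # hypothetical window size, chosen only for illustration


def spectrum(data: bytes, window: int = WINDOW) -> list[float]:
    """Average DFT magnitude spectrum over fixed-size windows of raw bytes."""
    samples = [b / 255.0 for b in data]  # scale bytes to [0, 1]
    if len(samples) < window:
        samples += [0.0] * (window - len(samples))  # zero-pad short inputs
    bins = window // 2 + 1  # frequencies from DC up to Nyquist
    acc = [0.0] * bins
    count = 0
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        for k in range(bins):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / window)
                for n, x in enumerate(chunk)
            )
            acc[k] += abs(coeff)
        count += 1
    return [a / count for a in acc]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(examples: dict[str, list[bytes]]) -> dict[str, list[float]]:
    """Compute one mean spectrum per class label."""
    model = {}
    for label, items in examples.items():
        spectra = [spectrum(item) for item in items]
        model[label] = [sum(col) / len(spectra) for col in zip(*spectra)]
    return model


def classify(model: dict[str, list[float]], data: bytes) -> str:
    """Return the label whose mean spectrum is most similar to the sample."""
    features = spectrum(data)
    return max(model, key=lambda label: cosine_similarity(model[label], features))


if __name__ == "__main__":
    # Hypothetical training snippets; a real corpus would be far larger.
    model = train({
        "weak": [b"strcpy(dst, src);", b"gets(buffer);"],
        "ok": [b"strncpy(dst, src, n);", b"fgets(buffer, n, stdin);"],
    })
    print(classify(model, b"strcpy(out, in);"))
```

Because the features come from raw bytes, the same sketch applies unchanged to source text, bytecode, or binaries. With a toy training set like the one above, the predicted label carries no real meaning.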

