Twin-Finder: Integrated Reasoning Engine for Pointer-related Code Clone Detection

11/01/2019
by Hongfa Xue, et al.

Detecting code clones is crucial in various software engineering tasks. In particular, code clone detection is useful for analyzing and fixing bugs in large-scale applications. However, prior work, such as machine learning based clone detection, may produce a considerable number of false positives. In this paper, we propose Twin-Finder, a novel, closed-loop approach for pointer-related code clone detection that integrates machine learning and symbolic execution techniques to improve precision. Twin-Finder introduces a clone verification mechanism to formally verify whether two clone samples are indeed clones, and a feedback loop that automatically generates formal rules to tune the machine learning algorithm and further reduce false positives. Our experimental results show that Twin-Finder can swiftly identify up to 9X more code clones compared to conventional code clone detection approaches. We conduct a security analysis for memory safety using the real-world applications Links version 2.14 and libreOffice-6.0.0.1. Twin-Finder is able to find 6 unreported bugs in Links version 2.14 and one publicly patched bug in libreOffice-6.0.0.1.
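
The closed loop described in the abstract (detect candidate clones, verify each candidate, feed verification failures back into the detector) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the similarity measure, the `decay` re-weighting rule, and the `verify` callback (which merely stands in for a symbolic-execution check) are all assumptions chosen for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set, Tuple

Features = Dict[str, str]
Pair = Tuple[str, str]


def weighted_similarity(a: Features, b: Features, weights: Dict[str, float]) -> float:
    """Weighted fraction of features on which two samples agree."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    agree = sum(w for name, w in weights.items() if a.get(name) == b.get(name))
    return agree / total


def closed_loop_detect(
    samples: Dict[str, Features],
    verify: Callable[[str, str], bool],  # hypothetical stand-in for formal verification
    threshold: float = 0.6,
    decay: float = 0.5,
    max_rounds: int = 5,
) -> Tuple[Set[Pair], List[int], Dict[str, float]]:
    """Detect candidate pairs, verify them, and re-weight features on failures."""
    names = sorted({n for feats in samples.values() for n in feats})
    weights = {n: 1.0 for n in names}
    confirmed: Set[Pair] = set()
    history: List[int] = []  # false positives seen in each round

    for _ in range(max_rounds):
        false_positives: List[Pair] = []
        for x, y in combinations(sorted(samples), 2):
            if weighted_similarity(samples[x], samples[y], weights) < threshold:
                continue  # the detector does not report this pair
            if verify(x, y):
                confirmed.add((x, y))
            else:
                false_positives.append((x, y))
        history.append(len(false_positives))
        if not false_positives:
            break
        # Feedback (assumed rule): features on which a rejected pair agreed
        # were misleading, so they count for less in the next round.
        for x, y in false_positives:
            for n in names:
                if samples[x].get(n) == samples[y].get(n):
                    weights[n] *= decay
    return confirmed, history, weights


# Toy data: A and B are true clones; A and D only look alike superficially.
samples = {
    "A": {"ops": "load-store", "bounds": "checked", "style": "k&r"},
    "B": {"ops": "load-store", "bounds": "checked", "style": "allman"},
    "D": {"ops": "load-add", "bounds": "checked", "style": "k&r"},
}
ground_truth = {("A", "B")}
confirmed, history, weights = closed_loop_detect(
    samples, lambda x, y: (x, y) in ground_truth
)
print(confirmed)  # {('A', 'B')}
print(history)    # [1, 0]
```

In this toy run the first round reports one false positive (A, D); after the re-weighting step the second round reports none, while the verified pair (A, B) is still detected.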


