Parallel Binary Code Analysis

01/28/2020
by Xiaozhu Meng, et al.

Binary code analysis is widely used to assess a program's correctness, performance, and provenance. Binary analysis applications often construct control flow graphs, analyze data flow, and use debugging information to understand how machine code relates to source lines, inlined functions, and data types. To date, binary analysis has been single-threaded, which is too slow for applications such as performance analysis and software forensics, where it is becoming common to analyze binaries that are gigabytes in size, often in batches of thousands of binaries. This paper describes our design and implementation for accelerating the construction of control flow graphs (CFGs) from binaries with multithreading. Existing research focuses on the challenging code constructs encountered during CFG construction, including functions that share code, jump tables, non-returning functions, and tail calls. However, existing analyses do not consider the complex interactions that arise when shared code is analyzed concurrently, which makes it difficult to parallelize existing serial algorithms. A systematic methodology to guide the design of parallel algorithms is therefore essential. We abstract the task of constructing CFGs as repeated applications of several core CFG operations that create functions, basic blocks, and edges. We then derive properties of these CFG operations, including operation dependency, commutativity, and monotonicity. These properties guide our design of a new parallel analysis for constructing CFGs. We achieved as much as 25× speedup for constructing CFGs on 64 hardware threads. Binary analysis applications are significantly accelerated by the new parallel analysis: with 16 hardware threads, we achieve an 8× speedup for a performance analysis tool and a 7× speedup for a software forensic tool.
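To illustrate why commutativity and monotonicity matter for parallel CFG construction, here is a minimal Python sketch. It is not the authors' implementation: the `CFG` class, its method names, the addresses, and the single coarse lock are all assumptions made for illustration. Each operation only ever adds functions, blocks, or edges (monotonic), and each is a set insertion, so the final graph does not depend on the order in which threads apply the operations (commutative).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CFG:
    """Toy CFG (hypothetical): every operation only adds elements."""

    def __init__(self):
        self._lock = threading.Lock()  # one coarse lock, chosen for simplicity
        self.functions = set()         # function entry addresses
        self.blocks = set()            # basic block start addresses
        self.edges = set()             # (source, target) address pairs

    def create_function(self, entry):
        with self._lock:
            self.functions.add(entry)
            self.blocks.add(entry)     # a function entry is also a block

    def create_block(self, start):
        with self._lock:
            self.blocks.add(start)

    def create_edge(self, src, dst):
        with self._lock:
            self.blocks.update((src, dst))  # an edge implies both endpoints
            self.edges.add((src, dst))

    def snapshot(self):
        with self._lock:
            return (frozenset(self.functions),
                    frozenset(self.blocks),
                    frozenset(self.edges))


def build(ops, workers):
    """Apply (method_name, *args) operations using a thread pool."""
    cfg = CFG()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda op: getattr(cfg, op[0])(*op[1:]), ops))
    return cfg.snapshot()


# Illustrative addresses: two functions that share the block at 0x1030.
OPS = [
    ("create_function", 0x1000),
    ("create_edge", 0x1000, 0x1010),
    ("create_edge", 0x1000, 0x1020),
    ("create_edge", 0x1010, 0x1030),
    ("create_edge", 0x1020, 0x1030),
    ("create_function", 0x2000),
    ("create_edge", 0x2000, 0x1030),
]

serial = build(OPS, workers=1)
parallel = build(list(reversed(OPS)), workers=4)
assert serial == parallel  # same graph regardless of order or thread count
```

The sketch covers only the easy case. Operations that depend on one another, which the abstract lists under operation dependency, need more coordination than a set insertion under a lock provides.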

