Parallelizing Binary Code Analysis

01/28/2020
by Xiaozhu Meng, et al.

Binary code analysis is widely used to assess a program's correctness, performance, and provenance. Binary analysis applications often construct control flow graphs, analyze data flow, and use debugging information to understand how machine code relates to source lines, inlined functions, and data types. To date, binary analysis has been single-threaded, which is too slow for applications such as performance analysis and software forensics, where it is becoming common to analyze binaries that are gigabytes in size, often in large batches of thousands of binaries. This paper describes our design and implementation for accelerating the Dyninst binary analysis infrastructure with multithreading. Our goal is to parallelize a commonly used set of binary analysis capabilities, making it easier to parallelize different binary analysis applications. New parallel binary analysis capabilities include constructing control flow graphs, analyzing loop nesting, parsing debugging information, and performing data flow analyses such as register liveness analysis and program slicing. We cover 146K lines of source code in Dyninst and its software dependencies. A systematic methodology to guide the parallelization is essential: we used data race detection tools to identify unsafe parallelism and employed performance profiling tools to pinpoint performance bottlenecks that merit attention. This methodology guided us to design a new parallel analysis for constructing control flow graphs and to identify thread-safety issues spread widely across the codebase. We achieved as much as 25X speedup for constructing control flow graphs and as much as 14X for ingesting DWARF on 64 hardware threads. Binary analysis applications are significantly accelerated with the new parallel Dyninst: we achieve 8X for a performance analysis tool and 7X for a software forensic tool with 16 hardware threads.


