Parallelizing Binary Code Analysis

01/28/2020
by Xiaozhu Meng, et al.

Binary code analysis is widely used to assess a program's correctness, performance, and provenance. Binary analysis applications often construct control flow graphs, analyze data flow, and use debugging information to understand how machine code relates to source lines, inlined functions, and data types. To date, binary analysis has been single-threaded, which is too slow for applications such as performance analysis and software forensics, where it is becoming common to analyze binaries that are gigabytes in size, often in batches of thousands of binaries. This paper describes our design and implementation for accelerating the Dyninst binary analysis infrastructure with multithreading. Our goal is to parallelize a commonly used set of binary analysis capabilities, making it easier to parallelize different binary analysis applications. New parallel binary analysis capabilities include constructing control flow graphs, analyzing loop nesting, parsing debugging information, and performing data flow analyses such as register liveness analysis and program slicing. We cover 146K lines of source code in Dyninst and its software dependencies. A systematic methodology to guide the parallelization is essential: we used data race detection tools to identify unsafe parallelism and employed performance profiling tools to pinpoint performance bottlenecks that merit attention. This methodology guided us to design a new parallel analysis for constructing control flow graphs and to identify thread-safety issues spread widely across the codebase. We achieved as much as 25X speedup for constructing control flow graphs and as much as 14X for ingesting DWARF on 64 hardware threads. Binary analysis applications are significantly accelerated with the new parallel Dyninst: we achieve 8X for a performance analysis tool and 7X for a software forensic tool with 16 hardware threads.
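To put the reported speedups in context, the short sketch below computes parallel efficiency (speedup divided by hardware thread count) for each figure quoted in the abstract. The dictionary keys and the helper name `parallel_efficiency` are illustrative and do not come from the paper. Because the abstract reports "as much as" figures, the results should be read as best-case values.

```python
# Speedups and hardware thread counts as quoted in the abstract.
# The labels are illustrative names, not identifiers from the paper.
reported = {
    "CFG construction": (25, 64),
    "DWARF ingestion": (14, 64),
    "performance analysis tool": (8, 16),
    "software forensic tool": (7, 16),
}


def parallel_efficiency(speedup, threads):
    """Fraction of ideal linear speedup achieved: speedup / thread count."""
    return speedup / threads


efficiencies = {
    name: parallel_efficiency(speedup, threads)
    for name, (speedup, threads) in reported.items()
}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} of linear speedup")
```

By this measure, the best-case control flow graph construction reaches roughly 39% of linear scaling on 64 hardware threads, while the performance analysis tool reaches 50% on 16.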

01/28/2020

Parallel Binary Code Analysis

Binary code analysis is widely used to assess a program's correctness, p...
05/03/2020

BCFA: Bespoke Control Flow Analysis for CFA at Scale

Many data-driven software engineering tasks such as discovering programm...
08/26/2017

Fast and Precise Type Checking for JavaScript

In this paper we present the design and implementation of Flow, a fast a...
09/02/2020

CcNav: Understanding Compiler Optimizations in Binary Code

Program developers spend significant time on optimizing and tuning progr...
05/22/2017

Mira: A Framework for Static Performance Analysis

The performance model of an application can provide understanding abou...
03/28/2020

liOS: Lifting iOS apps for fun and profit

Although iOS is the second most popular mobile operating system and is o...
08/05/2020

Interprocess Communication in FreeBSD 11: Performance Analysis

Interprocess communication, IPC, is one of the most fundamental function...