SoK: All You Ever Wanted to Know About x86/x64 Binary Disassembly But Were Afraid to Ask

07/28/2020
by Chengbin Pang, et al.

Disassembly of binary code is hard, but necessary for improving the security of binary software. Over the past few decades, research in binary disassembly has produced many tools and frameworks, which have been made available to researchers and security professionals. These tools employ a variety of strategies that give them different characteristics. The lack of systematization, however, impedes new research in the area and makes selecting the right tool hard, as we do not understand the strengths and weaknesses of existing tools. In this paper, we systematize binary disassembly through the study of nine popular, open-source tools. We couple manual examination of their code bases with the most comprehensive experimental evaluation to date, using 3,788 binaries. Our study yields a comprehensive description and organization of disassembly strategies, classifying each as either an algorithm or a heuristic. We also measure and report the impact of individual algorithms on the results of each tool. We find that while all tools use principled algorithms, they still rely heavily on heuristics to increase code coverage. Depending on the heuristics used, different coverage-vs-correctness trade-offs come into play, leading to tools with different strengths and weaknesses. We envision that these findings will help users pick the right tool and assist researchers in improving binary disassembly.

