Leveraging Artificial Intelligence on Binary Code Comprehension

10/11/2022
by Yifan Zhang, et al.

Understanding binary code is an essential but complex software engineering task in reverse engineering, malware analysis, and compiler optimization. Unlike source code, binary code carries limited semantic information, which makes it difficult for humans to comprehend. At the same time, compiling source code to binary code, or transpiling among different programming languages (PLs), can provide a way to introduce external knowledge into binary comprehension. We propose to develop Artificial Intelligence (AI) models that aid human comprehension of binary code. Specifically, we propose to incorporate domain knowledge from large corpora of source code (e.g., variable names, comments) to build AI models that capture a generalizable representation of binary code. Lastly, we will use human studies of comprehension to investigate metrics for assessing the performance of models applied to binary code.

research
10/11/2022

COMBO: Pre-Training Representations of Binary Code Using Contrastive Learning

Compiled software is delivered as executable binary code. Developers wri...
research
03/01/2021

Rethinking complexity for software code structures: A pioneering study on Linux kernel code repository

The recent progress of artificial intelligence (AI) has shown great poten...
research
05/13/2022

dewolf: Improving Decompilation by leveraging User Surveys

Analyzing third-party software such as malware or firmware is a crucial ...
research
03/09/2021

Finding Inlined Functions in Optimized Binaries

Much software, whether beneficent or malevolent, is distributed only as ...
research
04/07/2023

Revisiting Deep Learning for Variable Type Recovery

Compiled binary executables are often the only available artifact in rev...
research
07/22/2022

CARBON: A Counterfactual Reasoning based Framework for Neural Code Comprehension Debiasing

Previous studies have demonstrated that code intelligence models are sen...
research
05/06/2023

Revisiting Lightweight Compiler Provenance Recovery on ARM Binaries

A binary's behavior is greatly influenced by how the compiler builds its...
