LmPa: Improving Decompilation by Synergy of Large Language Model and Program Analysis

06/05/2023
by   Xiangzhe Xu, et al.

Decompilation aims to recover the source code form of a binary executable. It has many applications in security and software engineering, such as malware analysis, vulnerability detection, and code reuse. A prominent challenge in decompilation is recovering variable names. We propose a novel method that leverages the synergy of large language models (LLMs) and program analysis. Language models encode rich multi-modal knowledge, but their limited input size prevents them from receiving sufficient global context for name recovery. We propose to divide the task into many LLM queries and to use program analysis to correlate and propagate the query results, which in turn improves the performance of the LLM by providing additional contextual information. Our results show that 75% of the recovered names are considered good by users and that our technique outperforms the state-of-the-art technique by 16.5%.
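The abstract does not say how the task is split into queries or how results are propagated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one query per function, a hard-coded stub in place of a real LLM, and a single kind of program-analysis fact (an argument at a call site flows into the callee's parameter). All names in it (`FUNCTIONS`, `CALL_EDGES`, `stub_model`, `recover_names`) are hypothetical.

```python
from collections import Counter

# Hypothetical toy program: each decompiled function lists its placeholder
# variables. A call edge links a caller's argument to a callee's parameter.
FUNCTIONS = {
    "sub_401000": ["a1", "v2"],
    "sub_401080": ["a1"],
}
CALL_EDGES = [(("sub_401000", "v2"), ("sub_401080", "a1"))]


def stub_model(func, variables, hints):
    """Stand-in for one LLM query (assumption: no real model is called).

    Returns a name guess for each variable it can name, preferring a hint
    that was propagated from another function when one is available.
    """
    local_guesses = {("sub_401000", "v2"): "buffer_len"}
    guesses = {}
    for var in variables:
        if var in hints:
            guesses[var] = hints[var]
        elif (func, var) in local_guesses:
            guesses[var] = local_guesses[(func, var)]
    return guesses


def recover_names(functions, call_edges, model, rounds=3):
    """Alternate per-function queries with propagation along call edges."""
    votes = {(f, v): Counter() for f, vs in functions.items() for v in vs}
    for _ in range(rounds):
        # Step 1: query each function on its own, passing the current best
        # name of each variable as extra context.
        for func, variables in functions.items():
            hints = {
                v: votes[(func, v)].most_common(1)[0][0]
                for v in variables
                if votes[(func, v)]
            }
            for var, name in model(func, variables, hints).items():
                votes[(func, var)][name] += 1
        # Step 2: propagate the current best name across each call edge,
        # in both directions, as one extra vote.
        for src, dst in call_edges:
            for a, b in ((src, dst), (dst, src)):
                if votes[a]:
                    votes[b][votes[a].most_common(1)[0][0]] += 1
    # Variables that never received a vote stay unnamed.
    return {key: c.most_common(1)[0][0] for key, c in votes.items() if c}


names = recover_names(FUNCTIONS, CALL_EDGES, stub_model)
```

In this toy run the stub can only name `v2` in the caller from local context; the parameter `a1` of the callee gets its name solely through propagation, which is the kind of cross-function context a single size-limited query would lack.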


research
02/02/2022

Pop Quiz! Can a Large Language Model Help With Reverse Engineering?

Large language models (such as OpenAI's Codex) have demonstrated impress...
research
06/08/2019

Recovering Variable Names for Minified Code with Usage Contexts

In modern Web technology, JavaScript (JS) code plays an important role. ...
research
03/23/2021

Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling

Decompilation is the procedure of transforming binary programs into a hi...
research
06/28/2019

A Neural-based Program Decompiler

Reverse engineering of binary executables is a critical problem in the c...
research
09/10/2019

An Evalutation of Programming Language Models' performance on Software Defect Detection

This dissertation presents an evaluation of several language models on s...
research
08/13/2021

Augmenting Decompiler Output with Learned Variable Names and Types

A common tool used by security professionals for reverse-engineering bin...
research
08/17/2022

ASTRO: An AST-Assisted Approach for Generalizable Neural Clone Detection

Neural clone detection has attracted the attention of software engineeri...
