o-glasses: Visualizing x86 Code from Binary Using a 1d-CNN

06/14/2018
by Yuhei Otsubo, et al.

Malicious document files used in targeted attacks often contain a small program called shellcode. Because these files exploit specific vulnerabilities, it is often hard to prepare an environment in which they run for dynamic analysis. In such cases, the analyst must first locate the shellcode within each document file. If the exploit code uses executable scripts such as JavaScript or Flash, locating the shellcode is not especially hard. When the exploit contains no JavaScript or Flash and consists of native x86 code only, however, locating it is sometimes almost impossible. Binary fragment classification is often applied to visualize the location of regions of interest, and shellcode must contain at least a small fragment of native x86 code even if most of it is obfuscated, for example a decoder for the obfuscated body of the shellcode. In this paper, we propose a novel method, o-glasses, which visualizes shellcode by recognizing native x86 code with a specially designed one-dimensional convolutional neural network (1d-CNN). The fragment size must be as small as the smallest piece of native x86 code in the whole shellcode. Our results show that a sequence of 16 instructions (approximately 48 bytes on average) is sufficient for code fragment visualization. o-glasses (1d-CNN) outperforms other methods, recognizing native x86 code with a surprisingly high F-measure (about 99.95
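The fragment-level visualization described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the fixed 48-byte window is an assumption standing in for a 16-instruction sequence, and `toy_classifier`, `label_fragments`, and `render_map` are hypothetical names. The classifier is a deliberately crude stand-in for the 1d-CNN and only counts a few common one-byte x86 opcodes.

```python
from typing import Callable, List, Tuple

# Assumption: a fixed byte window approximating a 16-instruction sequence.
FRAGMENT_BYTES = 48

# Stand-in only: a few common one-byte x86 opcodes
# (push ebp, mov, mov, ret, call rel32, nop, jmp rel8).
COMMON_OPCODES = frozenset({0x55, 0x89, 0x8B, 0xC3, 0xE8, 0x90, 0xEB})


def toy_classifier(fragment: bytes) -> bool:
    """Hypothetical placeholder for a trained model: flag a fragment as
    code when at least a quarter of its bytes are common opcodes."""
    hits = sum(b in COMMON_OPCODES for b in fragment)
    return hits / len(fragment) >= 0.25


def label_fragments(
    data: bytes,
    classify: Callable[[bytes], bool],
    size: int = FRAGMENT_BYTES,
) -> List[Tuple[int, bool]]:
    """Split data into non-overlapping windows and label each one.
    A trailing partial window is ignored."""
    labels = []
    for offset in range(0, len(data) - size + 1, size):
        labels.append((offset, classify(data[offset:offset + size])))
    return labels


def render_map(labels: List[Tuple[int, bool]]) -> str:
    """Render one character per fragment: '#' for code, '.' otherwise."""
    return "".join("#" if is_code else "." for _, is_code in labels)


if __name__ == "__main__":
    # Synthetic input: zero padding, one 48-byte code-like run, more padding.
    code_like = bytes([0x55, 0x89, 0xE5, 0x8B, 0x45, 0x08, 0xC3, 0x90]) * 6
    sample = bytes(96) + code_like + bytes(48)
    print(render_map(label_fragments(sample, toy_classifier)))
```

Passing the classifier as a callable keeps the windowing and rendering separate from the model, so a trained network could replace the opcode heuristic without changes to the rest of the sketch.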


