Learning Malware Representation based on Execution Sequences

12/16/2019
by Yi-Ting Huang, et al.

Malware analysis has been extensively investigated as the number and types of malware have increased dramatically. However, most previous studies use end-to-end systems to detect whether a sample is malicious or to identify its malware family. In this paper, we propose a neural network framework composed of an embedder, an encoder, and a filter to learn malware representations from characteristic execution sequences for malware family classification. The embedder uses BERT and Sent2Vec, state-of-the-art embedding modules, to capture relations within a single API call and among consecutive API calls in an execution trace. The encoder comprises gated recurrent units (GRU) to preserve the ordinal position of API calls and a self-attention mechanism to compare intra-relations among different positions of API calls. The filter identifies representative API calls to build the malware representation. We conduct broad experiments to determine the influence of individual framework components. The results show that the proposed framework outperforms the baselines, and also demonstrate that using Sent2Vec to learn complete API call embeddings and GRU to explicitly preserve ordinal information yields more information and thus significant improvements. In addition, the proposed approach effectively classifies new malicious execution traces on the basis of similarities with previously collected families.
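To make the embedder, encoder, and filter pipeline concrete, the following is a minimal NumPy sketch of the data flow and not the authors' implementation. Several parts are assumptions made for illustration: random vectors stand in for the BERT/Sent2Vec API call embeddings, the GRU has no bias terms and untrained random weights, and the filter is approximated by keeping the `k` positions that receive the most attention and averaging their context vectors. The names `gru_encode`, `self_attention`, `filter_top_k`, and `represent` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_encode(X, params):
    """Run a single-layer GRU over X (T, d_in); return hidden states (T, d_h)."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = np.zeros(Uz.shape[0])
    states = []
    for x in X:  # one step per API call, in execution order
        z = sigmoid(x @ Wz + h @ Uz)              # update gate
        r = sigmoid(x @ Wr + h @ Ur)              # reset gate
        h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
        h = (1.0 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)

def self_attention(H):
    """Scaled dot-product self-attention over the GRU states."""
    scores = H @ H.T / np.sqrt(H.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ H, weights

def filter_top_k(C, weights, k):
    """Assumed filter: keep the k positions that receive the most attention."""
    importance = weights.mean(axis=0)
    keep = np.sort(np.argsort(importance)[-k:])
    return C[keep].mean(axis=0), keep

def represent(X, params, k=3):
    H = gru_encode(X, params)      # encoder: ordinal information
    C, A = self_attention(H)       # encoder: relations across positions
    return filter_top_k(C, A, k)   # filter: representative API calls

rng = np.random.default_rng(0)
T, d_in, d_h = 8, 16, 12
X = rng.normal(size=(T, d_in))     # placeholder API call embeddings
shapes = [(d_in, d_h), (d_h, d_h)] * 3
params = [rng.normal(scale=0.1, size=s) for s in shapes]
vec, kept = represent(X, params, k=3)
print(vec.shape, kept)
```

In this sketch the resulting vector could be passed to any classifier over malware families, or compared with stored family vectors by a similarity measure; how the original framework performs that final step is not specified in the abstract.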

research
02/13/2018

Towards Generic Deobfuscation of Windows API Calls

A common way to get insight into a malicious program's functionality is ...
research
05/06/2019

A Benchmark API Call Dataset for Windows PE Malware Classification

The use of operating system API calls is a promising task in the detecti...
research
11/20/2017

MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models (Extended Version)

As Android becomes increasingly popular, so does malware targeting it, t...
research
06/09/2023

Early Malware Detection and Next-Action Prediction

In this paper, we propose a framework for early-stage malware detection ...
research
12/25/2021

An Ensemble of Pre-trained Transformer Models For Imbalanced Multiclass Malware Classification

Classification of malware families is crucial for a comprehensive unders...
research
08/24/2019

Precise system-wide concolic malware unpacking

Run time packing is a common approach malware use to obfuscate their pay...
research
07/17/2019

Dynamic Malware Analysis with Feature Engineering and Feature Learning

Dynamic malware analysis executes the program in an isolated environment...
