Adversarial Examples: Attacks on Machine Learning-based Malware Visualization Detection Methods

by   Xinbo Liu, et al.
Imperial College London

As the threat of malicious software (malware) grows more serious, automatic malware detection techniques have received increasing attention, and machine learning (ML)-based visualization detection plays a significant role among them. This raises a fundamental question: are such detection methods robust enough against potential attacks? Even though ML algorithms outperform conventional ones in malware detection in terms of efficiency and accuracy, this paper demonstrates that such ML-based malware detection methods are vulnerable to adversarial example (AE) attacks. We propose the first AE-based attack framework, named Adversarial Texture Malware Perturbation Attack (ATMPA), based on the gradient descent or L-norm optimization method. By introducing tiny perturbations on the transformed dataset, we cause ML-based malware detection methods to fail completely. The experimental results on the MS BIG malware dataset show that a small interference can reduce the detection rate of convolutional neural network (CNN), support vector machine (SVM) and random forest (RF)-based malware detectors to 0, and that the attack transferability can reach up to 88.7% across different ML-based detection methods.


I Introduction

With the rapid development of the Internet and smartphones, the amount of malicious software has grown unexpectedly fast in recent years. More than 3.24 million new malware samples were detected in the Android application market in 2016 [1]. In 2017, Google Play intercepted more than 700,000 malicious applications, an annual growth of 70% [2]. These counterfeit applications use unwrapped content or malware to impersonate legitimate applications and interfere with the normal software market. According to a recent report from AV-TEST, even though the popular ransomware accounts for less than 1% of all malware, billions of dollars have been lost to it since 2016 [3].

Malware denotes a particular type of program that performs malicious tasks and exerts illegal control over computer systems by breaking software processes to obtain unauthorized access, interrupt normal operations and steal information from computers or mobile devices. There are many kinds of malware, including viruses, Trojans, worms, backdoors, rootkits, spyware, ransomware and panic software [4]. Countermeasures against such malware threats can be classified into three types: digital signatures [5], static code analysis [6] and dynamic code analysis [7]. For traditional digital signature methods, since the number of new signatures released every year grows exponentially [5], it is impractical to check each sample against signatures and detect malicious behavior one by one. Static code analysis can achieve complete coverage through disassembly, but it usually suffers from code obfuscation. In addition, executable files must be unpacked and decrypted before analysis, which significantly limits detection efficiency. Compared to static code analysis, dynamic code analysis does not need to unpack or decrypt the executable, because the file is run in a virtual environment. However, it is time-intensive and resource-consuming, and scalability is another important issue [7]. More importantly, the methods mentioned above have difficulty detecting certain malicious behaviors that are well camouflaged or whose trigger conditions are not satisfied.

With the development of machine learning and visualization techniques, researchers have borrowed the visualization idea from the fields of computer forensics and network monitoring and used it to classify malware [8]. By converting binary code into image data, the approach can not only visualize the feature information of various types of samples but also improve the detection speed for malicious software. The detection process is simpler than in conventional approaches. Optimizations of ML algorithms have further advanced malware visualization detection techniques [4, 9, 10] in terms of preventing zero-day attacks, detecting malware non-destructively without extracting preselected features, improving detection accuracy and robustness, and reducing time and memory consumption.

However, adversarial examples (AEs) pose a threat to these detectors. AEs are special samples generated from the original dataset by adding tiny perturbations; the discriminative model can hardly distinguish them from the originals, so erroneous detection results can be induced. AEs were originally proposed by Szegedy et al. [11] in 2014. Since the discriminative model is built from a dataset without AEs, its detection accuracy can be reduced dramatically by AEs. A large number of AE-construction methods [12, 11, 13, 14, 15] and related AE-defence techniques [16, 17, 18, 19, 20] have been proposed.

In this paper, we propose the first approach to attack different ML-based visualization malware detection methods based on AEs, named Adversarial Texture Malware Perturbation Attack (ATMPA). We extract the features of state-of-the-art approaches for AE generation and use the rectifier in neural-network hidden layers to improve the robustness of AE training. ATMPA is applied to three malware visualization detection techniques to verify its attack effectiveness and transferability. In the experimental evaluation, we use the malware dataset from the MS BIG database and convert it to a set of binary texture grayscale images. AEs generated by training on such grayscale images are used to attack three ML-based detectors, i.e. CNN, SVM and RF, and their wavelet-combined versions. The experimental results demonstrate that all of the ML-based malware detection methods are vulnerable to our ATMPA attack, with a success rate of 100%. In the transferability test among different ML-based detection methods, ATMPA achieves an attack transferability of up to 88.7%, and 74.1% on average.

The contributions of this paper are as follows:

  • The first attack framework for ML-based malware detectors using AEs, which demonstrates the security vulnerabilities of ML applications.

  • Analysis of various AE variants to identify the most effective attack against ML-based malware detectors.

  • Evaluation and comparative analysis of the proposed ATMPA in terms of feasibility and transferability rates for CNN, SVM and RF-based malware detectors.

  • Extraction of the features of AE-based attacks, along with several countermeasure solutions.

The rest of this paper is organized as follows. Section II reviews related work on malware visualization detection methods and AE techniques. Three kinds of ML-based malware visualization detection algorithms and two popular AE crafting methods are introduced in Section III. Section IV describes the proposed Adversarial Texture Malware Perturbation Attack (ATMPA) method and elaborates its design flow, model structure and algorithm pseudocode. Section V presents the experimental results, analyzes the parameters that generate the AEs for malware and demonstrates the effectiveness of ATMPA. Section VI gives some qualitative defensive strategies against AE-based attacks. Finally, Section VII concludes the paper and provides some future research directions.

II Related Work

II-A Background of AE

Adversarial examples (AEs) are crafted by adding subtle perturbations to original datasets for image classification problems and are able to fool state-of-the-art deep neural networks (DNNs) with high probability. In general, transferability and robustness are two key features of AEs. Transferability refers to the degree to which AEs crafted against one model also succeed against other models. Robustness represents the ability to withstand or overcome AE attacks. Prior work can be classified into un-targeted and targeted AEs. Un-targeted AEs cause the classifier to produce any incorrect output without specifying a predicted category, while targeted AEs cause the classifier to produce a specific incorrect output. In the past three years, various AE crafting methods have been proposed, such as L-BFGS [11], FGSM [12], JSMA [13], C&W's attack [14], DeepFool [15] and Generative Adversarial Networks (GAN) [21]. Among them, research has shown that FGSM [12] and C&W's attack [14] are the most popular choices for generating AEs [22, 23]. We therefore focus on both in this paper.

II-B Code analysis-based malware detection approach

Code analysis-based malware detection can be classified into static code analysis and dynamic code analysis. Static code analysis analyzes malware without executing the program. The detection patterns used in static analysis include byte-sequence n-grams, string signatures, control flow graphs, syntactic library calls and opcode (operational code) frequency distributions [24]. For example, disassemblers (e.g. IDA Pro [25] and OllyDbg [26]) and memory dumper tools (e.g. LordPE [27] and OllyDump [28]) can be used to reverse-engineer executables and analyze the malware statically. On the other hand, analysing the behaviour of malicious code while it is executed in a controlled environment, such as a virtual machine, simulator or emulator, is called dynamic code analysis [7]. Dynamic code analysis is more effective as it discloses the natural behaviour of malware. Nowadays, there exist several automated tools for dynamic analysis, such as CWSandbox [29], TTAnalyzer [30], Ether [31] and Norman Sandbox [32]. However, it is time-consuming to read the analysis reports generated by these tools to understand malware behavior.

II-C ML-based malware detection approach

In order to accelerate the malware detection process, ML-based visualization detection methods have received great attention in recent years. This kind of method has been proposed for detecting unknown samples and either classifying them into known malware families or highlighting those that exhibit unseen behavior. However, when faced with different sample types, these ML-based methods have limited detection ability. To overcome this, ML-based visualization detection methods have been proposed to detect all types of malware. By virtue of the transformed binary grayscale images, otherwise indistinguishable malware can be easily detected by different detectors based on RF [33], SVM [4], CNN [9] and GAN [21].

II-D AE-based attack

AEs are used to bypass the detection of malicious code. Grosse et al. [10] used AEs to interfere with the binary features of Android malware. They aimed to attack DNN-based detection of Android malware while retaining the malicious features in the apps. However, this kind of attack only applies to Android malware samples that are represented by binary features. For end-to-end static code detection techniques using convolutional neural networks, Kreuk et al. [34] used AEs to extract the feature information from binary code and disguised malware as a benign sample by injecting a small sequence of bytes (payload) into the binary file. This method not only retains the original function of the malware but also deceives a global binary malware detector. However, it is extremely sensitive to discrete input data such as executable bytes: minor modifications to the bytes of the file may lead to significant changes in its functionality and validity. Hu et al. [35] proposed a method of generating sequential AEs by improving the robustness of the adversarial pre-training process, which can be used to attack RNN detection systems based on sequential API features. The drawbacks of this method are its long running time and the large overhead of the whole process. Different from the existing studies that use AEs to evade malware detection, this paper proposes the first method that uses AEs to attack ML-based visualization detectors, named ATMPA. It is a new attack framework based on visual transformation that employs specifically designed adversarial noise to deceive deep learning detection tools and, through the transferability of AEs [36], other ML-based detectors. The noise is faint enough that it is not readily noticed, and we obtain a high transferability rate between different detectors.

III Preliminary

Before presenting our methodology, we detail the preliminary theory used in this paper, including malware visualization and ML-based detection methods.

III-A Malware Visualization

Malware visualization methods [8, 37, 38, 4] transform hard-to-identify code information into image data with certain feature information. A specific segment of code represents a definite type of malware. Lee et al. [39] first applied visualization technology to accelerate the process of malware detection.

The visualisation process includes the following steps. To begin with, we collect the malware sample set that needs to be transformed and read each sample as a vector of 8-bit unsigned integers. After that, we organize each vector into a two-dimensional array. Each value in the array is interpreted as the gray value of a pixel in the range from 0 to 255, where 0 indicates black and 255 indicates white. According to the characteristics and analytical requirements of different data sets, the width of the transformed images can be adjusted as needed (e.g., 32 for file sizes below 10KB and 64 for file sizes between 10KB and 30KB). Once the width of the image is fixed, its height follows from the file size. Figure 1 illustrates the flowchart of the malware transformation process with a binary file from the MS BIG database [40], where a common Trojan horse download program (Dontovo A) is converted to an 8-bit grayscale image with a width of 64. The distinctive image texture describes the various primitive binary fragments in the malware image, and each special section is elaborated in the transformed grayscale image. As shown in Figure 1, the code segments .text, .rdata, .data and .rsrc correspond to 8-bit grayscale images.

Fig. 1: Process of Malware visualization transformation
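The transformation steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, only the widths 32 and 64 come from the text, the fallback width of 128 for larger files is an assumption, and zero-padding of the last row is likewise an assumption.

```python
import numpy as np

def width_for_size(num_bytes):
    """Pick an image width from the file size in bytes."""
    kb = num_bytes / 1024
    if kb < 10:
        return 32          # width given in the text for files below 10KB
    if kb <= 30:
        return 64          # width given in the text for 10KB to 30KB
    return 128             # assumption: placeholder width for larger files

def bytes_to_grayscale(data, width=None):
    """Map raw bytes to a 2-D uint8 array (0 = black, 255 = white)."""
    if width is None:
        width = width_for_size(len(data))
    vec = np.frombuffer(data, dtype=np.uint8)      # vector of 8-bit unsigned integers
    height = -(-len(vec) // width)                 # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(vec)] = vec                       # assumption: zero-pad the last row
    return padded.reshape(height, width)

# Synthetic bytes standing in for a malware binary (2560 bytes, i.e. 2.5KB).
sample = bytes(range(256)) * 10
image = bytes_to_grayscale(sample)
print(image.shape, image.dtype)
```

With the width fixed, the height is determined by the file size alone, which is why files of different sizes yield images of different heights.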

Visualization technology provides a new direction for the study of malware. ML-based detection methods are combined with visualization to improve the efficiency and accuracy of malware detection [40].

III-B Machine Learning-based Detection Method

In the transformed malware image, the intensity of each pixel lies between 0 and 255, and statistics such as the average intensity, variance, skewness and kurtosis can be computed from it. It is straightforward to extract feature information from the byteplot using, for example, Wavelet and Gabor-based methods. After feature extraction, the data set is used to construct the discriminative model. In the following, we discuss three popular ML algorithms, which serve as the case studies to be evaluated.

III-B1 Random forest

Random forest (RF) is an ensemble learning method for classification, regression and detection. RF constructs a multitude of decision trees at training time and outputs the mode of the classes (classification) or the mean prediction (regression) of the individual trees [41]. In this paper, the RF training algorithm uses bootstrap aggregating (bagging) for tree learners. Given a training set $X = x_1, \dots, x_n$ with responses $Y = y_1, \dots, y_n$, bagging repeatedly ($B$ times) selects a random sample with replacement of the training set and fits a tree to each sample. For $b = 1, \dots, B$, we train a classification or regression tree $f_b$ on $X_b, Y_b$. The prediction for an unseen sample $x'$ is obtained by averaging the predictions from all the individual regression trees on $x'$:

$$\hat{f} = \frac{1}{B} \sum_{b=1}^{B} f_b(x')$$

or by taking the majority vote in the case of classification trees. In addition, the prediction uncertainty can be estimated as the standard deviation of the predictions from all the individual regression trees on $x'$:

$$\sigma = \sqrt{\frac{\sum_{b=1}^{B} \left(f_b(x') - \hat{f}\right)^2}{B - 1}}$$

where the number of samples/trees ($B$) is a free parameter. Generally, a few hundred to several thousand trees are used, depending on the size and nature of the training set. We use cross-validation and the out-of-bag error to obtain an appropriate number of trees ($B$) that gives stable training and test results.
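The aggregation step of bagging can be illustrated with a short sketch. It assumes the per-tree predictions are already available as arrays; the function names and the toy numbers are hypothetical, and the trees themselves are not trained here.

```python
import numpy as np

def aggregate_regression(tree_preds):
    """Return the bagged mean and the sample standard deviation over the trees."""
    mean = float(tree_preds.mean())
    std = float(tree_preds.std(ddof=1))   # ddof=1 divides by (number of trees - 1)
    return mean, std

def aggregate_classification(tree_votes):
    """Return the majority-vote label among non-negative integer votes."""
    return int(np.bincount(tree_votes).argmax())

# Toy predictions from three regression trees and votes from five classification trees.
preds = np.array([0.2, 0.4, 0.6])
mean, std = aggregate_regression(preds)
votes = np.array([1, 0, 1, 1, 2])
label = aggregate_classification(votes)
print(mean, std, label)
```

The standard deviation gives a rough per-sample uncertainty, while the majority vote is the classification counterpart of the averaged prediction.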

III-B2 Support Vector Machine

Support Vector Machine (SVM) is a widely used machine learning algorithm [42, 43], which aims to fit a hyperplane that separates the two classes in a dataset with the largest possible margin. In the SVM-based linear classification method, an unknown sample ($x$) is classified by the generated hyperplane, which is obtained by optimizing the weight vector ($w$) and threshold ($b$) in the decision function. The expression of the assigned label ($y$) is as follows,

$$y = \mathrm{sign}(w^{\top} x + b)$$

If the differences between malware images are too large for the dataset to be fitted with a linear model, we can treat the dataset as a non-linear data distribution. Nonlinear decision functions map samples into a feature space with special properties, called a Reproducing Kernel Hilbert Space, by applying a nonlinear transformation. Therefore, choosing an appropriate nonlinear kernel function is important for the classification and detection analysis. Compared with the primal solution (e.g. weight $w$), solving for the dual parameters $\alpha_i$ can be used not only for nonlinear decision functions but also for variants of the dual SVM learning problem. For the decision function, we use the kernel $K(x, x_i)$ to compare imaged samples, where $x$ denotes an unknown sample that needs to be compared and $x_i$ denotes a selected sample in the training dataset. A data pair $(x_i, y_i)$ with a non-zero $\alpha_i$ is commonly referred to as a support vector, and the decision function is:

$$y = \mathrm{sign}\left(\sum_{i} \alpha_i y_i K(x, x_i) + b\right)$$
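As an illustration of a kernel-based decision function, the following sketch evaluates a dual-form SVM with an RBF kernel. The kernel choice, the function names, and all numeric values (support vectors, dual coefficients, bias, gamma) are assumptions made for the example; nothing here is fitted to malware data.

```python
import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

def decision(x, support_vectors, labels, alphas, bias, gamma=0.5):
    """Dual-form decision: sign of the kernel-weighted sum over support vectors."""
    score = sum(
        a * y * rbf_kernel(x, sv, gamma)
        for a, y, sv in zip(alphas, labels, support_vectors)
    )
    return 1 if score + bias >= 0 else -1

# Hypothetical support vectors: one benign (-1) and one malware (+1) sample.
support_vectors = np.array([[0.0, 0.0], [2.0, 2.0]])
labels = np.array([-1, 1])
alphas = np.array([1.0, 1.0])

print(decision(np.array([1.9, 2.1]), support_vectors, labels, alphas, bias=0.0))
print(decision(np.array([0.1, 0.0]), support_vectors, labels, alphas, bias=0.0))
```

Only the support vectors enter the sum, so prediction cost depends on how many training samples end up with a non-zero coefficient rather than on the full training set.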

III-B3 Convolutional Neural Networks

Artificial neural networks (ANNs) [44] are computing systems inspired by biological neural networks. They are based on a collection of connected units or nodes called artificial neurons (analogous to biological neurons in the animal brain), organized in interconnected layers. An activation function is applied in each neuron to transform its input into a corresponding output. Starting from the model input, each layer of the network produces an output that is used as the input of the next layer. ANNs with a single intermediate layer (called a hidden layer) are known as shallow neural networks, and those with multiple hidden layers are called Deep Neural Networks (DNNs), such as Convolutional Neural Networks (CNNs) [45]. The basic structure of a CNN typically consists of convolutional layers, pooling layers, fully connected layers and normalization layers. Multiple hidden layers extract representations hierarchically from the input and then output a representation or prediction relevant to the problem. For visualization images, the feature information of malware is gradually amplified through the extraction of multiple layers. A CNN model can be viewed as a simple mathematical model defining a function $F: x \mapsto y$, in which $F$ can be formalized as the composition of multidimensional and parameterized functions $f_i$, each corresponding to layer $i$ of the network architecture. The simplified formula is shown as follows:

$$F(x) = f_n(\theta_n, f_{n-1}(\theta_{n-1}, \dots f_2(\theta_2, f_1(\theta_1, x))))$$

where the set of model parameters $\theta = \{\theta_1, \dots, \theta_n\}$ is learned during training. Each vector $\theta_i$ includes the weights for the links connecting layer $i-1$ to layer $i$. Taking supervised learning as an example, parameter values can be estimated based on the prediction errors $F(x) - y$ made on an input-output pair $(x, y)$.

IV Adversarial Texture Malware Perturbation Attack (ATMPA)

Because feature extraction and data reduction are commonly used in numerical models, malware may go unnoticed within real-world data samples owing to its specificity and camouflage. Introducing AEs is highly likely to flip the original detection result, which leads to a severe security threat [23]. To demonstrate this vulnerability, we propose an Adversarial Texture Malware Perturbation Attack (ATMPA) method in this paper. Our proposal is used to interfere with several ML-based visualization detectors. In the following, we first introduce the proposed ATMPA method and the corresponding application schemes, and then describe the generation of malware AEs based on ATMPA.

IV-A Framework

The attack model considered in this paper is that attackers can interfere with malware images during the visualisation transformation without being identified by the original detectors, as shown in Figure 2(a). As there exist corresponding mechanisms [39, 5, 46] between binary code and visualised images, the malware images can be converted back to the corresponding binary code [39, 47]. AEs can therefore be converted back to binary code through the reverse transformation to achieve the attack effect. The framework of ATMPA consists of three functional modules: data transformation, adversarial pre-training and malware detection. Malware detection functions include AE crafting and perturbed detection. The overall ATMPA architecture is shown in Figure 2(a). If there is no attack in the process, i.e. no AE is involved, the second module (adversarial pre-training) does not exist, as shown in Figure 2(b). In that case, the code segments of malware are transformed to grayscale images in the data transformation module, and the malware images are sent directly to ML-based malware detectors such as CNN, SVM and RF. The green detection arrow in Figure 2(b) indicates the normal detection process.

(a) Architecture of the proposed ATMPA framework
(b) Architecture of the original Malware Detection Process
Fig. 2: The Architecture of the ML-based Malware visualization Detection Process

To account for AE attacks, we introduce a new module, named adversarial pre-training. After a series of preprocessing steps, including data transformation and normalization, the malware dataset required for generating AEs is processed in the AE pre-training module, and the dataset used for testing is propagated to the malware detection module, as shown in Figure 2(a). In the AE pre-training module, attackers can use deep learning methods, such as CNN or RNN, to obtain an AE crafting model by training on a known malware dataset (shown in the purple block of Figure 2(a)). Then, taking advantage of the AE crafting model, attackers can apply interference that acts as a noise signal on the targeted sample. Based on the obtained AE crafting model and different AE attack methods, such as FGSM and C&W's attack, attackers can apply small but intentionally worst-case perturbations to the transformed malware images. Such perturbations are difficult for other users to observe.

Once the original information has been perturbed, the targeted (or un-targeted) AEs, together with the original dataset, are propagated to the malware detector. As original detectors are usually not able to resist AEs, the ML-based malware detectors are induced to produce the incorrect result desired by the attacker, as illustrated by the red arrow in Figure 2(a). To evaluate the universal applicability of the ATMPA framework, we choose the commonly used CNN, SVM and RF-based ML detectors.

IV-B Training process for Visualization-based Malware Detector

Since our proposed ATMPA method is not designed for a single detector, we have designed different types of detectors based on CNN, SVM, and RF in the module of malware detection. There are two training processes in this module: one is to achieve the normal malware detectors and the other is to obtain the AEs.

IV-B1 Pre-training process for malware detector

To obtain more accurate ML-based detectors, a pre-training process is usually performed with training samples to build a discriminant model. With this discriminant model, the analyst can run detection on the visualised malware images. As shown in Figure 2(b), the different detectors in the normal detection process have been pre-trained with the transformed malware dataset. This paper uses CNN (e.g. GoogLeNet), SVM and RF algorithms to build the corresponding malware detectors.

IV-B2 Training process for Adversarial Pre-training

Besides, the proposed ATMPA method also uses GoogLeNet to craft malware AEs in the pre-training process. We use a binary indicator vector $X$ to represent malware samples without any particular structural properties or interdependencies. In the AE-training network, we apply a feed-forward neural network in which the rectifier is used as the activation function for each hidden neuron. Standard gradient descent and dropout are also used for training. To normalize the output probabilities, we also employ a softmax layer, whose output is computed as:

$$F_i(X) = \frac{e^{x_i}}{\sum_{k} e^{x_k}}, \qquad x_i = \sum_{j} w_{j,i} \cdot x_j + b_{j,i}$$

where $F$ is the activation function, $x_i$ denotes the input to a neuron, $w$ is the weight vector learned by gradient descent and $b$ represents the corresponding bias values.
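A feed-forward pass with a rectifier hidden layer and a softmax output can be sketched as follows. The layer sizes, random weights and function names are hypothetical and only serve to make the example runnable; this is not the GoogLeNet architecture used in the paper.

```python
import numpy as np

def relu(x):
    """Rectifier activation applied element-wise."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Normalize a vector of scores into probabilities."""
    shifted = logits - np.max(logits)   # shift for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def forward(x, w1, b1, w2, b2):
    """One hidden layer with rectifier units followed by a softmax output."""
    hidden = relu(w1 @ x + b1)
    return softmax(w2 @ hidden + b2)

rng = np.random.default_rng(0)
x = rng.uniform(size=16)                         # hypothetical flattened image
w1, b1 = rng.normal(size=(8, 16)), np.zeros(8)   # hypothetical hidden layer
w2, b2 = rng.normal(size=(2, 8)), np.zeros(2)    # two outputs: benign, malware
probs = forward(x, w1, b1, w2, b2)
print(probs, probs.sum())
```

Subtracting the maximum score before exponentiation leaves the result unchanged but avoids overflow for large inputs.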

Therefore, during AE generation in our ATMPA method, attackers do not need to construct a new AE-training model. AEs can be crafted with a CNN structure similar to the one used for malware detector training, so the crafted AEs are difficult to detect and can readily be used to attack ML-based malware detectors.

IV-C Crafting Adversarial Malware Examples

It is usually difficult for attackers to compromise the malware detector itself. However, attackers are able to modify samples so that they disguise their original counterparts. AEs are created by making a subtle and hardly detectable change to a data sample $x$, in the form $x^* = x + \delta$. Although the detector classifies $x$ correctly, the target detector produces an incorrect result on the AE $x^*$ because of the perturbation $\delta$. In order to deceive the malware detector and induce it to present the wrong result (a benign label in attack mode), two popular AE generation methods (FGSM and C&W) are optimized and then used in this paper.

IV-C1 FGSM-based method

FGSM is a fast method to generate AEs [12]. It selects a perturbation by differentiating the cost function with respect to the input itself. FGSM performs only a one-step gradient update along the direction of the gradient's sign at each pixel. The perturbation can be expressed as:

$$\delta = \epsilon \cdot \mathrm{sign}(\nabla_x J(\theta, x, y)) \qquad (1)$$

where $\epsilon$ is a small scalar value that restricts the magnitude of the perturbation and represents the distortion between AEs and the original sample. $\mathrm{sign}(\cdot)$ denotes the sign function, and $\nabla_x J(\theta, x, y)$ computes the gradient of the cost function $J$ around the current value of the model parameters $\theta$. $y$ is the label of $x$.

Using the gradient-based method, we first compute the detector's gradient with respect to the input $x$ to estimate the direction in which a perturbation in $x$ would change the output of the detector $F$. In the FGSM crafting method, we use Eq. (1) to represent the perturbation information. We then choose a perturbation $\delta$ of the input $x$ with maximal positive gradient into the target class $t$, and carry it into the next fitting step. If the label $t$ is a benign sample label, the crafted AE is a pseudo-benign AE sample.

Moreover, in order to better mislead the classification, we use an index $i$ with FGSM, obtained by finding the maximum gradient toward the target class $t$ of the detector output $F$, as

$$i = \arg\max_{j} \frac{\partial F_t(x)}{\partial x_j}$$

We repeat this process with index $i$ until either the pre-judgment result produces the misclassification, i.e. $\arg\max_j F_j(x^*) = t$ for the AE $x^*$, or the index value reaches the threshold $i_{max}$. The smaller $i$ is, the fewer iterations are needed to generate AEs and the more efficient the FGSM method is.

Using the FGSM method as an example, Figure 3 illustrates in detail the process of crafting malware image AEs with subtle perturbations. The original malware sample is recognized by the detector with 88.5% confidence. After the perturbation with the distortion ($\epsilon$) is applied through the FGSM method, the perturbed malware sample induces the detector to output the benign label with 99.8% confidence. By adjusting the value of the interference parameter $\epsilon$, a subtle perturbation ($\delta$) is introduced to the original samples along the direction of the gradient. This subtle perturbation is difficult for other users to observe. Algorithm 1 gives the pseudo-code for generating AEs, along with the corresponding explanation. The perturbation is controlled to guarantee that we do not cause a negative change in the target-class output $F_t$ due to intermediate changes of the gradient.

Fig. 3: Process of Perturbed Visualized Malware
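    $x^* \leftarrow x$, $i \leftarrow 0$   //Data preprocessing with $x$
  while $\arg\max_j F_j(x^*) \neq t$ and $i < i_{max}$ do
     $\delta \leftarrow \epsilon \cdot \mathrm{sign}(\nabla_x J(\theta, x^*, y))$, $i \leftarrow i + 1$   //compute forward derivative
     if  $\max_j \partial F_t(x^*) / \partial x_j \leq 0$  then
        return  Failure
     end if
     $x^* \leftarrow x^* + \delta$
  end while
Algorithm 1 Crafting Adversarial Example for Malware image with FGSM-based method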
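The iterative use of the FGSM step can be illustrated on a toy detector. The sketch below swaps the CNN for a linear logistic model so that the input gradient has a closed form; the weights, the synthetic image, the step size `eps` and the iteration limit `max_iter` are all hypothetical, and the code is an illustration of the sign-gradient update rather than the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def malware_prob(x, w, b):
    """Toy linear detector: probability that image vector x is malware."""
    return float(sigmoid(w @ x + b))

def input_gradient(x, y, w, b):
    """Gradient of the cross-entropy cost w.r.t. the input x (logistic model)."""
    return (malware_prob(x, w, b) - y) * w

def fgsm_attack(x, y, w, b, eps=0.05, max_iter=10):
    """Apply sign-gradient steps until the toy detector outputs 'benign'.

    Returns (adversarial image or None on failure, number of steps taken).
    """
    x_adv = x.copy()
    for step in range(max_iter + 1):
        if malware_prob(x_adv, w, b) < 0.5:       # misclassified as benign
            return x_adv, step
        if step == max_iter:                      # iteration threshold reached
            break
        grad = input_gradient(x_adv, y, w, b)
        if not np.any(grad):                      # no usable gradient direction
            break
        # One FGSM step: move each pixel by eps along the gradient's sign,
        # then clip back to the valid normalized pixel range [0, 1].
        x_adv = np.clip(x_adv + eps * np.sign(grad), 0.0, 1.0)
    return None, step

rng = np.random.default_rng(0)
w = rng.normal(size=64)                  # hypothetical detector weights
x = rng.uniform(0.2, 0.8, size=64)       # hypothetical normalized 8x8 malware image
b = 2.0 - float(w @ x)                   # bias chosen so the clean logit is 2.0
x_adv, steps = fgsm_attack(x, y=1.0, w=w, b=b)
print(malware_prob(x, w, b), steps)
```

Each pixel moves by at most `eps` per step, so the total distortion is bounded by `eps` times the number of steps, which is what keeps the perturbation visually subtle.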

IV-C2 C&W's attack-based method

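Carlini and Wagner summarized the distance metrics used in previous related work and proposed a new attack strategy based on the $L_p$ norm, named C&W's attack [14]. By optimising the penalty term and distance metrics, three attacks for different $L_p$ norms have been proposed correspondingly, following the formula below:

$$\min_{\delta} \; \|\delta\|_p + c \cdot f(x + \delta) \quad \text{s.t.} \quad x + \delta \in [0, 1]^n \qquad (2)$$

where the $L_p$ norm penalty ensures that the added perturbation is small. The same optimization procedure with a different $L_p$ norm can achieve similar attacks with a modified constraint.

Taking the $L_2$ attack as an example, the perturbation $\delta$ is defined in terms of an auxiliary variable $w$ as

$$\delta = \frac{1}{2}(\tanh(w) + 1) - x$$

Then, to find $\delta$ (the perturbation), we optimize $w$ as,

$$\min_{w} \; \left\| \frac{1}{2}(\tanh(w) + 1) - x \right\|_2^2 + c \cdot f\left(\frac{1}{2}(\tanh(w) + 1)\right) \qquad (3)$$

where $f(\cdot)$ is an objective function based on the hinge loss,

$$f(x') = \max\left(\max_{i \neq t} Z(x')_i - Z(x')_t, \; -\kappa\right)$$

Here, $Z(x')_i$ represents the softmax function for class $i$, which is mostly linear from the input to the adversarial example, $t$ is the target class, and $\kappa$ denotes a constant that controls the confidence with which the misclassification occurs.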
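The pieces of this objective can be evaluated numerically with a short sketch. It follows the published form of the C&W $L_2$ attack (tanh change of variable plus a hinge-style term); the function names, the constant `c`, and the toy score function are assumptions made for the example, and no optimizer is included.

```python
import numpy as np

def to_image(w):
    """Map an unconstrained variable w to a valid image in [0, 1]."""
    return 0.5 * (np.tanh(w) + 1.0)

def hinge_objective(scores, target, kappa=0.0):
    """Largest non-target score minus the target score, floored at -kappa."""
    others = np.delete(scores, target)
    return float(max(others.max() - scores[target], -kappa))

def cw_l2_loss(w, x, scores_fn, target, c=1.0, kappa=0.0):
    """Squared L2 distortion plus c times the hinge-style objective."""
    x_adv = to_image(w)
    distortion = float(np.sum((x_adv - x) ** 2))
    return distortion + c * hinge_objective(scores_fn(x_adv), target, kappa)

# Toy class scores for three classes.
scores = np.array([2.0, 0.5, 1.0])
print(hinge_objective(scores, target=1))             # target not yet the top class
print(hinge_objective(scores, target=0, kappa=0.5))  # target already wins by a margin

# Hypothetical score function: class 0 sums the pixels, class 1 is constant.
scores_fn = lambda img: np.array([img.sum(), 1.0])
x = np.array([0.5, 0.5])
loss = cw_l2_loss(np.zeros(2), x, scores_fn, target=1)
```

Because `to_image` always returns values in [0, 1], the box constraint of the original problem is satisfied by construction and an unconstrained optimizer can be applied to `w`.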

In the proposed ATMPA method, the $L_p$-based C&W's attack method is also introduced to generate malware AEs, including the $L_2$ attack. According to the detector's hierarchical structure in malware images, we modified the optimization formulation and optimized the AE generating process in the ATMPA method [48]. Taking the $L_2$ attack for instance, with the auxiliary variable $w$ and the original sample $x$, we optimise the formula in Eq. 2 as

$$\min_{w} \; \left\| \frac{1}{2}(\tanh(w) + 1) - x \right\|_2^2 + c_1 \cdot f_1\left(\frac{1}{2}(\tanh(w) + 1)\right) + c_2 \cdot f_2\left(\frac{1}{2}(\tanh(w) + 1)\right) \qquad (4)$$

where the original single optimization item $c \cdot f$ from the formula in Eq. 3 is divided into two parts, $c_1 \cdot f_1$ and $c_2 \cdot f_2$, in formula 4. Here, $f_1$ and $f_2$ denote the corresponding loss functions of the classifier and the detector. The constants $c_1$ and $c_2$ are chosen via binary search. By doing so, the proposed ATMPA method could be improved in terms of universality and extensibility.

Algorithm 2 gives the pseudo-code of the optimisation used to generate AEs with the C&W's attack-based method under the $L_2$ norm, along with the corresponding explanation.

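    $x^* \leftarrow x$, $i \leftarrow 0$   //Data preprocessing with $x$
  while $\arg\max_j Z(x^*)_j \neq t$ and $i < i_{max}$ do
     $f(x^*) = \max(\max_{j \neq t} Z(x^*)_j - Z(x^*)_t, -\kappa)$   //where, $Z(x^*)_j$ is the softmax function for class $j$
     Optimize the $w$ with
     $\min_w \| \frac{1}{2}(\tanh(w) + 1) - x \|_2^2 + c_1 \cdot f_1 + c_2 \cdot f_2$   //where, $c_1$ and $c_2$ are chosen via binary search
     if  no $w$ reduces the objective  then
        return  Failure
     end if
     $x^* \leftarrow \frac{1}{2}(\tanh(w) + 1)$, $i \leftarrow i + 1$
  end while
Algorithm 2 Crafting Adversarial Example for Malware image with C&W’s attack-based method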

V Experiment

The experimental evaluation is conducted in terms of effectiveness and transferability. We first introduce the experimental setup, including the routines needed to preprocess the collected dataset. Three visualization detectors based on CNN, SVM and RF are used as case studies to evaluate the effectiveness of ATMPA. Furthermore, the transferability of ATMPA between different detection algorithms is analyzed by using the same set of generated AEs. In general, the higher the transferability rate, the more effectively the proposed attack framework can be applied to other ML-based malware detectors. Cross-validation is also used for statistical evaluation of the reliability of the proposed ATMPA. The detailed descriptions of each experiment are as follows.

V-A Dataset

To evaluate the performance of the proposed method, we use an open-source malware dataset from the Kaggle Microsoft Malware Classification Challenge (BIG 2015). It consists of 10,867 labeled malware samples in nine classes. To collect benign examples, we scrape all the valid executables on Windows 7 and Windows XP in a virtual machine. We select 1000 benign samples by checking them against the anti-virus vendors in the VirusTotal intelligence search. The distribution of these samples is illustrated in Table I. Considering the robustness and expandability of artificial neural network structures, we adopt a state-of-the-art deep neural network, the GoogleNet Inception V3 architecture, for AE generation. Finally, the FGSM and C&W’s attack methods are used to generate AEs by adding the perturbation. In this paper, the constructed AEs are targeted AEs, which disguise malware as a benign sample to deceive the detector. We call them pseudo-benign samples.

Types of Malware Number of Samples
Ramnit (R) 1534
Lollipop (L) 2470
Kelihos ver3 (K3) 2942
Vundo (V) 451
Simda (S) 41
Tracur (T) 685
Kelihos ver1 (K1) 386
Obfuscator.ACY(O) 1158
Gatak (G) 1011
Benign sample(B) 1000
TABLE I: The MS BIG dataset with malware class distribution and benign samples

V-B Attack Effectiveness

The effectiveness of ATMPA is evaluated by attacking three commonly used malware detectors based on the CNN (GoogleNet), SVM and RF algorithms. Using 10-fold cross-validation, the original malware sample set is randomly partitioned into ten equal-sized subsample sets. One set of subsamples is replaced by AE samples and used as the test data to measure the success rate (%) of the attacks, and the remaining nine sets of subsamples are used for training and detection. A rate of 100% denotes that all of the AEs successfully mislead the detector, while 0% indicates that all AEs are recognized. Different AEs can be generated by using the FGSM method with a slight adjustment of the distortion parameter and by applying C&W methods based on different norms. The experimental results demonstrate that all three ML-based detectors are easily misled, especially the CNN-based detectors. The effectiveness of the attacks and a comparison analysis are presented in detail in the following.
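The evaluation protocol described above can be sketched as follows. This is only an illustration: the helper names, the fixed random seed and the use of the label "B" for benign (taken from Table I) are assumptions, and the detector itself is left out.

```python
import random

def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Randomly partition sample indices into n_folds nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]

def attack_success_rate(predictions, benign_label="B"):
    """Percentage of pseudo-benign AEs that the detector labels as benign."""
    if not predictions:
        return 0.0
    hits = sum(1 for p in predictions if p == benign_label)
    return 100.0 * hits / len(predictions)
```

In each round, one fold would be replaced by AEs and scored with `attack_success_rate`, while the other nine folds train the detector; the reported figure is then the mean over the ten rounds.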

Detector  Detection method  Distortion  Success rate (%)
CNN       Basic CNN         0.4         100
CNN       Wavelet+CNN       0.5         100
SVM       Basic SVM         0.5         100
SVM       Wavelet+SVM       0.6         99.8
RF        Basic RF          0.4         100
RF        Wavelet+RF        0.6         99.7
TABLE II: Attack results with the FGSM method

Detector  Detection method  Distortion (three L-norm distances)  Success rate (%)
CNN       Basic CNN         0.23  0.30  0.36                     100
CNN       Wavelet+CNN       0.34  0.41  0.40                     100
SVM       Basic SVM         0.51  0.43  0.40                     100
SVM       Wavelet+SVM       0.69  0.65  0.32                     99.7
RF        Basic RF          0.49  0.43  0.39                     100
RF        Wavelet+RF        0.56  0.52  0.49                     100
TABLE III: Attack results with C&W’s attack method
(a) Three basic learning methods
(b) Three feature-extracted combined methods
(c) All tested methods
Fig. 4: Comparison result of FGSM Attack in basic learning methods and feature-extracted combined method

V-B1 CNN

To begin with, we use the pseudo-benign AEs to attack a CNN visualization detector built on GoogleNet Inception V3. Within 10 iterations along the gradient descent direction, attackers only need to adjust the distortion parameter around its default value for the CNN-based detector to be misled into producing wrong results. Through cross-validation analysis, we find that if the distortion parameter is adjusted to 0.4, the success rate of the attacks is 100%, as shown in Table II. Figure 4 clearly shows that the success rate increases with the distortion value. CNN-based detection methods are generally considered robust and highly accurate, particularly for visualization-based image classification problems. However, the experimental results demonstrate that CNN-based applications are vulnerable to AEs.
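To make the iterative gradient-sign procedure concrete, here is a minimal sketch of a targeted FGSM-style attack on a toy linear softmax classifier. The linear model is a stand-in for the Inception V3 detector, and splitting the distortion budget evenly over the 10 iterations is an assumption of this example, not a detail taken from the experiments.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def sign(g):
    return (g > 0) - (g < 0)

def targeted_fgsm(W, b, x, target, eps, iters=10):
    """Move x towards the target class while staying within eps per pixel."""
    step = eps / iters  # assumption: equal step size per iteration
    x_adv = list(x)
    for _ in range(iters):
        p = softmax(logits(W, b, x_adv))
        # Gradient of the cross-entropy towards the target w.r.t. the logits.
        err = [pi - (1.0 if k == target else 0.0) for k, pi in enumerate(p)]
        # Chain rule through the linear layer gives the input gradient.
        grad = [sum(W[k][j] * err[k] for k in range(len(W))) for j in range(len(x))]
        # Descend the target loss and keep pixel values in [0, 1].
        x_adv = [min(1.0, max(0.0, xj - step * sign(g))) for xj, g in zip(x_adv, grad)]
    return x_adv
```

For a pseudo-benign AE, `target` would be the index of the benign class, and `eps` plays the role of the distortion value listed in Table II.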

Furthermore, if we use a small learning rate such as 0.05 to prevent the feature information of the original sample from being extracted too quickly, AEs generated by C&W’s attack can easily mislead the CNN-based detectors within 100 iterations of gradient descent. In Table III, we list the distortion values under the different L-norm distances together with the high success rates of C&W’s attack. If attackers set the interference parameters between 0.2 and 0.4, the generated AEs achieve a success rate of 100% for all attacks on the CNN-based detector.

V-B2 SVM

Because SVM and CNN rely on different mechanisms, an SVM-based malware detector is attacked to verify the scalability of the proposed ATMPA. In this experiment, we construct an SVM-based malware detector with the default parameters of the open-source library libsvm. As shown in Table II, when the distortion value is set to 0.5, attackers obtain a success rate of 100%. Compared with the other basic detection methods, even though the distortion value for SVM is the largest, the SVM-based detector still struggles to resist the attack from ATMPA. Meanwhile, the rising success-rate curve (blue dashed line) in Figure 4(a) shows that, as the distortion in the FGSM-based attack increases, the success rate gradually approaches 100%. Once the distortion value exceeds 0.4, the corresponding SVM-based detector becomes invalid.

V-B3 RF

Apart from the CNN and SVM-based detectors, we also use the pseudo-benign AEs to attack the RF-based detector with the default parameter settings from open-source code. Using 10-fold cross-validation for statistical evaluation, the mean success rate easily reaches 99.7% when the distortion value is set to 0.6 in the FGSM method. A similar success rate is reached when the C&W-based method is used, as listed in Table III. If attackers set the distortion value high enough, e.g., over 0.36, all of the L-norm attacks (including $L_0$, $L_2$ and $L_\infty$) can successfully mislead the RF-based detector. Therefore, we can conclude that the proposed ATMPA method is effective against RF-based detectors.

V-B4 Comparisons Analysis

The attack effects on the different ML algorithms are compared in the following. First, the experimental results show that the CNN-based detector is the most vulnerable to ATMPA. Neither the CNN-based nor the RF-based detector is more robust than the widely used SVM-based detector. More interestingly, the SVM-based detector resists adversarial attacks better than the other detection methods, requiring a higher distortion value (as listed in Table II). Comparing the curvature and gradient of the three curves in Figure 4(a), the curve of the SVM-based detector shows the best robustness. However, this resistance is limited. With only a slight adjustment of the distortion scale in the FGSM and C&W’s methods, such as increasing the strength by 10%, ATMPA can easily break through the defense of the SVM-based detector. The mean distortion value from the cross-validation analysis is larger for SVM than for the others, which reflects that ATMPA provides good model transferability. Namely, our proposed attack method can be extended to linear supervised learning methods. With respect to the RF-based detector, ATMPA behaves very similarly when attacking the RF and CNN detectors, according to the comparison in Table II and Figure 4(a). On the one hand, the curvature of the two rising curves follows a similar trend. On the other hand, when the attack becomes fully effective (i.e., the success rate is 100%), the distortion values required to generate AEs are also very close, generally around 0.5 in Table II. Therefore, the ability of RF-based malware detectors to defend against ATMPA is very limited compared with the SVM counterpart.

V-C Attack Transferability

The transferability comparison is carried out in two respects. From the perspective of similar types of detection methods, we use the same AE set from ATMPA to attack similar detectors, such as the CNN and wavelet-combined CNN detectors. From the perspective of different types of detection methods, we again use the same AE set from ATMPA, but attack different detectors, such as CNN vs. SVM or CNN vs. wavelet-combined SVM detectors.

In this section, the wavelet transformation is used to strengthen the robustness of ML-based detection methods. Wavelet transformation, a classical technique for feature extraction and spatial compression [49], has been applied in many research fields, and wavelet-combined ML methods have been widely used in image classification. If the malware image is processed through the wavelet transformation, the sparse space of the image is compressed. The feature information is extracted, and the noise or other interference information is weakened, so the overall performance is improved. We use the same group of AEs to attack CNN, SVM and RF-based malware detectors optimized with the wavelet technique. If the generated AEs succeed in misleading the optimised detector, ATMPA has an equivalent effect on the optimized detector. Comparison results in terms of the transferability rate are shown in Table IV.
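The compression effect of a wavelet transformation on a malware image can be illustrated with a one-level 2D Haar approximation. The choice of the Haar wavelet and the helper name `haar_approximation` are assumptions for this sketch, since the wavelet family used in the experiments is not stated here.

```python
def haar_approximation(image):
    """One-level 2D Haar approximation band (orthonormal scaling).

    Each output value is the sum of a 2x2 pixel block divided by 2.
    A trailing odd row or column is dropped. The image is a list of
    equal-length rows of numbers.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]
```

Because every output value averages a block of neighbouring pixels, isolated pixel-level noise is attenuated, while each image dimension is halved.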

Source \ Target  CNN    SVM    RF     Wavelet-CNN  Wavelet-SVM  Wavelet-RF
CNN              100%   57.8%  86.3%  88.5%        54.2%        75.6%
SVM              64.5%  100%   75.5%  54.2%        85.7%        64.5%
RF               81.5%  45.6%  100%   67.9%        48.2%        88.7%
TABLE IV: Transferability rate in different detectors with ATMPA
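One plausible way to compute an entry of such a table is sketched here, assuming the transferability rate is the share of AEs that fool the source detector and also fool the target detector. This definition and the function name are assumptions for illustration; the exact definition used for Table IV may differ.

```python
def transferability_rate(source_fooled, target_fooled):
    """Percentage of AEs fooling the source detector that also fool the target.

    Both arguments are equal-length lists of booleans, one entry per AE.
    """
    transferred = [t for s, t in zip(source_fooled, target_fooled) if s]
    if not transferred:
        return 0.0
    return 100.0 * sum(transferred) / len(transferred)
```

Under this definition every diagonal entry is 100%, which matches the diagonal of Table IV.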

V-C1 Wavelet-CNN

ATMPA can defeat the wavelet-combined CNN detector (Wavelet-CNN) with strong transferability. By increasing the range of the distortion value, the Wavelet-CNN detector can be induced to label malware as benign. The error rate of the detector rises from 85% to 100% as the intensity of the distortion in the FGSM method increases. Similarly, the Wavelet-CNN detector also produces an error rate of 100% under C&W’s attack. Attackers only need to increase the distortion range from 0.15 to 0.18 on the different L-norm distances. Generally, AEs can successfully induce the CNN-based detectors to output the attacker’s pre-set label when the distortion value is more than 0.50. Whether FGSM or C&W is used, the generated AEs achieve a 100% success rate when attacking the CNN-based and wavelet-combined detectors. The same AE set is then applied to test the transferability by attacking different detectors. Experimental results show that the maximum transferability is 88.5% and the minimum is 54.2%. CNN-based detection methods can hardly resist attacks from ATMPA.

V-C2 Wavelet SVM

Wavelet-combined SVM (Wavelet-SVM) detection is also used to measure the transferability of the ATMPA method. The Wavelet-SVM malware detector is more resistant than the Wavelet-CNN detector, but it still struggles to resist the attack from ATMPA. With the distortion parameter fixed, Wavelet-SVM shows the lowest success rate. The ability to resist AEs is thus greatly improved in the SVM-based detectors.

As for the comparison between Wavelet-SVM and SVM-based detectors, the Wavelet-SVM can resist AE attacks, even though such AEs can make the SVM-based detector completely fail. The transferability rate in different detectors of the same classified samples is shown in Table IV, such as 64.5% to CNN and 54.2% to wavelet-CNN.

When the distortion parameter is set to in FGSM, or the L-norm distance to 0.71 in C&W, the originally stable anti-AE’s SVM detector will be easily attacked as well. Therefore, even though the wavelet-SVM detector is more resistant to AEs, attackers can also fully induce these detectors by choosing appropriate perturbations.

V-C3 Wavelet RF

A comparison of the RF and Wavelet-RF detectors shows that the transferability of the proposed ATMPA method is robust. As shown in Table IV, the maximum transferability rate is 88.7%. We follow the default parameter settings described above. The Wavelet-RF detection method shows a certain robustness against AE attacks, but its defensive ability is extremely limited. Taking the FGSM method as an example, starting from the distortion value at which the success rate against the conventional RF detector reaches 100%, a 15% increase in the distortion parameter makes the attack on Wavelet-RF reach the same 100% success rate. As for C&W’s attack methods, the attack effect does not fluctuate much even when the wavelet-combined technique is introduced. By increasing the L-norm distance values by approximately 11.5%, setting them to 0.45, 0.55 and 0.65 respectively, the pseudo-benign AEs generated by C&W’s attack method achieve the same 100% success rate. ATMPA built on C&W’s attack method has better transferability than the FGSM version, and the C&W-based ATMPA attacks RF and CNN-based detectors more easily.

VI Defensive Strategy against Malware AE

By analysing the differences between the AEs and the original samples, the iterative process of generating AEs and the algorithmic structure of the different detectors, we summarise some qualitative strategies for defending against AE attacks.

First, we compare the mean distances between the original inputs and the corresponding normal AE samples and pseudo-benign AE samples. We find that the two mean distances have similar values. The normal AEs generated for obstructing malware detectors perform similarly to the pseudo-benign AEs, and their mathematical distribution is also close to the uniform distribution, which reflects the universality and extensibility of the AE crafting process in confusing the analyst. Therefore, with a subtle perturbation that is hard to observe, these AEs can easily sway the detection result of the detectors.

The process of AE generation follows the optimal gradient of the variants in the target category; therefore, whatever the target label is (even an irrelevant one), the detector will finally be induced to produce the result desired by attackers. Taking the FGSM method as an example, Figure 5 shows how the absolute value of the gradient loss in the objective function changes as the number of iterations grows. For the target sample, the gradient loss has its largest value in the first iteration. With subtle perturbations of each variable along the direction of gradient descent, the gradient loss rapidly falls towards zero, a reduction of two orders of magnitude. The entire process of AE generation can be considered a learning stage for the target sample. Therefore, malware visualization detection based on level-by-level learning methods has difficulty resisting the attack from AEs.

Fig. 5: The loss value over the iterations

The AEs used in this paper are generated on a CNN with a hierarchical structure, whereas some ML algorithms such as SVM have linear structures. That is why the SVM and wavelet-SVM based malware detectors can resist this kind of AE attack to some extent, while RF and CNN-based detectors with a similar hierarchical discriminant structure are vulnerable to AE attacks. We can use this property to resist AE attacks by choosing appropriate ML algorithms.

As shown in Table IV, when we use ATMPA to attack detectors with similar algorithmic structures, the transferability rates are usually higher than for detectors with different algorithmic structures. For example, from CNN to RF the transferability rate is 86.3%, but from RF to SVM the rate drops to 45.6%. Therefore, AEs generated by using neural networks (NNs), for example, can not only make the conventional detectors fail but also attack other algorithms of similar structure with high transferability. As a result, it is difficult for a conventional NN-based malware detector to defend against AE attacks.

However, based on the above observations, some possible defensive strategies against the AE attack are proposed:

  • According to the similarity among different AEs and the distribution features among different perturbation values, the detection accuracy can be improved if the dataset is preprocessed to enlarge the difference between samples.

  • According to the generation process of AEs, including the variation characteristics of the gradient loss and the iteration number, attackers can be prevented from learning the feature information of the normal samples by expanding the dataset or increasing the differences between samples.

  • According to the characteristics of the different algorithmic structures in malware detectors and of the AE generation process, defenders could build a security module to resist the obstruction from AEs in ML-based malware visualization detection. For example, linear optimization detection based on Linear Discriminant Analysis (LDA), Logistic Regression or the Hyperplane Separating method can be used.

VII Conclusion

In this paper, we propose ATMPA, the first attack framework that uses adversarial examples to challenge visualized malware detectors. ATMPA uses the gradient-based FGSM method and the L-norm-based C&W’s attack method to generate AEs on the converted image dataset. None of the tested ML-based visualization detectors is able to identify the malware correctly. Experimental results demonstrate that very weak adversarial noise can achieve a 100% attack success rate against CNN, SVM and RF-based malware detectors. More importantly, the transferability rate when attacking different malware detectors is up to 88.7%.

In the future, a new malware defense method using adversarial examples could be designed to simulate malware variants, so as to enhance the sample dataset used for training. In combination with adversarial training or a generative adversarial network (GAN), it would increase the ability to defend against zero-day attacks. Secondly, adversarial examples are trained per classification problem, meaning that they operate on the set of labels that has been trained. Switching to an alternative set of labels is likely to reduce their attack effectiveness. Another interesting future research topic is to develop ATMPA for these scenarios, e.g., for hierarchy-based labels (such as Animal-Horse-Zebra). Finally, ATMPA could be studied and introduced in other modalities, such as sound/speech processing to address users with recognition impairments.