Towards resilient machine learning for ransomware detection

12/21/2018 ∙ by Li Chen, et al. ∙ 0

There has been a surge of interest in using machine learning (ML) to automatically detect malware through its dynamic behaviors. These approaches have achieved significantly higher detection rates and lower false positive rates at large scale compared with traditional malware analysis methods. ML in threat detection has proven to be a good cop guarding platform security. However, it is imperative to ask: is ML-powered security resilient enough? In this paper, we juxtapose the resiliency and trustworthiness of ML algorithms for security, via a case study that evaluates the resiliency of ransomware detection using the generative adversarial network (GAN). In this case study, we propose to use GAN to automatically produce dynamic features that exhibit generalized malicious behaviors and can reduce the efficacy of black-box ransomware classifiers. We examine the quality of the GAN-generated samples by comparing their statistical similarity to real ransomware and benign software. Further, we investigate the latent subspace where the GAN-generated samples lie and explore why such samples cause a certain class of ransomware classifiers to degrade in performance. Our focus is to emphasize the defense improvements needed in ML-based approaches for ransomware detection before deployment in the wild. Our results and discoveries should pose relevant questions for defenders, such as how ML models can be made more resilient for robust enforcement of security objectives.




I Introduction

Ransomware is a type of malicious software (malware) that hijacks and blocks a victim’s data or machine until a monetary ransom is paid. Its life cycle consists of six phases [mcafeelabs2016]: i) Distribution: the ransomware arrives at the victim’s machine via an email attachment, a drive-by download or a code dropper; ii) Infection: the ransomware installs itself to survive a reboot and disables shadow copies or anti-virus processes; iii) Communication: the ransomware contacts its Command and Control (C&C) server for the encryption key; iv) Preparation: the ransomware scans the user’s files, usually pdf, docx and jpg files; v) Encryption: the ransomware encrypts the selected user files; and finally vi) Extortion: a “ransom note” asking for payment is displayed to the user. After the ransom is paid, instructions for receiving the decryption key are sent to the user.

There are two main categories of ransomware based on attack approach: locker-ransomware and crypto-ransomware [cabaj2018software, gomez2018r]. Locker-ransomware locks the victim’s computer without encrypting anything. Crypto-ransomware encrypts the victim’s files, and the encryption is very difficult to revert. The quick solution is to pay the ransom and hope the given key can truly decrypt the data. Thus crypto-ransomware remains a notorious security issue today. In our case study, we focus on crypto-ransomware.

The popularity of the Internet, untraceable payment methods and the availability of software development tools make ransomware a feasible weapon for remote adversaries [al2018ransomware]. In recent years, ransomware has posed increasingly major threats. Since 2017, ransomware attacks have increased over 59% yearly, with 35% growth in Q4 2017 alone. Although the trend of devising new ransomware declined in 2018, the occurrence of ransomware attacks is still rising [mcafee032018, mcafee092018].

Static analysis is a fast and safe technique to detect ransomware prior to execution. However, it is ineffective against obfuscated, packed or metamorphic malware, which can easily bypass binary analysis or signature-based detection. Dynamic analysis, on the other hand, can reveal true malicious intentions by executing malware in a contained environment such as a sandbox. Recent research has found that behavior analysis based on API calls, registry accesses, I/O activities or network traffic can be effective for ransomware detection [cabaj2018software, hampton2018ransomware, scaife2016cryptolock, kharraz2016unveil, morato2018ransomware, gomez2018r, continella2016shieldfs].

Faced with a tsunami of malware attacks, the security industry is employing machine learning (ML) to automatically detect threats and enhance platform security. Its confidence in ML is not ungrounded. ML algorithms have demonstrated state-of-the-art performance in Computer Vision (CV), Natural Language Processing (NLP) and Automatic Speech Recognition (ASR), and have pushed the boundary of what was once thought impossible to achieve. The success of ML has generated huge interest in applying it to platform security domains such as automated malware detection, where it has shown promising value for better security [rieck2011automatic, chen2018henet, shamili2010malware, chen2017semi]. For ransomware detection in particular, ML algorithms such as naive Bayes, support vector machine, random forest and logistic regression have shown high true positive rates and low false positive rates [narudin2016evaluation, alhawi2018leveraging, ransomware_rsa2017]. Shallow and deep neural networks have also demonstrated high effectiveness at ransomware detection [vinayakumar2017evaluating, aragorn2016deep, chen2017deep].

Recent research takes advantage of the opaqueness of neural network (NN) algorithms to generate subtly perturbed input examples that have been shown to evade ML-based detection. These emerging attacks, in which an adversary can control the decision of the ML model through small “targeted” input perturbations, expose a broad attack surface. Although most Adversarial Machine Learning (AML) publications [carlini2017adversarial, biggio2018wild, dalvi2004adversarial, ilyas2018black, carlini2018audio] focus on misclassification in the CV and ASR domains, adversarial examples are spreading to the generation of sophisticated adversarial malware, which can perform real-time evasive attacks by camouflaging malicious behavior as that of legitimate software, keeping its maliciousness intact while fooling ML detection at run-time. For example, AVPASS [jungavpass-bh] is a fooling mechanism that generates potent variations of existing Android malware by querying and inferring the features used by a malware detection system. Several academic researchers and Anti-Virus (AV) companies have shown the promise of ML-based approaches for thwarting ransomware attacks on user systems [continella2016shieldfs, sgandurra2016automated].

The malicious use of ML motivates us to properly study adversarial attack threat models and to investigate the robustness and vulnerability of ML-powered security defense systems. In this paper, we present a case study on using deep learning to automatically bypass ML-powered dynamic ransomware detection systems. We propose a framework based on the generative adversarial network [goodfellow2014generative] to generate dynamic ransomware behaviors, together with a set of quality metrics to justify that the generated samples indeed retain maliciousness. We discover that most of the selected highly effective ransomware classifiers fail to detect the adversary-generated ransomware, indicating a broad attack surface for ML-powered security systems. We thoroughly examine the latent feature space to understand where the adversarial examples lie. We believe that defenders can incorporate our proposed framework to minimize the blind spots of their detection algorithms. Our case study examines the roles of ML as both a good cop and a bad cop for platform security.

The goal of our paper is to provide a framework for understanding the resiliency of ransomware detectors. We do not enable a true attack on user systems. The ML models in this paper are developed in house based on our collected data. As demonstrated in this paper, we advocate that defenders fortify their ML models for ransomware detection via adversarial studies.

Our contributions are summarized as follows:

  1. Although the generative adversarial network (GAN) has been used to generate fake samples that resemble the true data distribution, our framework is the first to study ML resiliency by using GAN to automatically generate dynamic ransomware behaviors. Our experiments show that ML models are highly effective in combating real-world ransomware threats and can achieve classification accuracy of up to 99% with an extremely low false positive rate (FPR), yet the same in-house ML models fail to detect the GAN-generated adversarial samples. To stabilize training and achieve convergence, we utilize data segmentation techniques and an auxiliary classifier GAN architecture.

  2. We propose a set of quality metrics to validate the generated adversarial ransomware and demonstrate that the GAN-generated samples from our framework maintain maliciousness as verified by these metrics. Although our ML classifiers classify these adversarial samples as benign, our quality metrics show that the adversarial samples are statistically much closer to real ransomware samples.

  3. We emphasize that robustness against adversarial samples is a metric as important as accuracy, false positive rate, true positive rate and F-score for thoroughly evaluating a ransomware detection scheme before deployment. In our experiment, only one of the seven models shows strong resiliency against the GAN-generated samples, indicating a broad adversarial attack surface of ML algorithms. On the other hand, our experiments provide guidance for security practitioners on developing resilient ML algorithms that can defend against adversarial attacks.

  4. We study why the highly effective models are susceptible by investigating the latent feature space, and we provide an understanding of the blind spots of these ML models. We present our findings to raise awareness in the security community that adversarial threat models need to be properly evaluated before deploying ML models to detect malware attacks.

The rest of the paper is organized as follows: Sec. II briefly provides the background on ransomware analysis, generative adversarial networks and adversarial machine learning. Sec. III describes the data collection architecture and feature mapping. Sec. IV presents our proposed framework and quality assessment procedure. Sec. V presents experimental results on our dataset.

II Background and Related Work

II-A Ransomware Detection

Cabaj et al. [cabaj2018software] use HTTP message sequences and content sizes to detect ransomware. Morato et al. [morato2018ransomware] analyze file-sharing traffic for early ransomware detection. Scaife et al. [scaife2016cryptolock] provide an early detection system by monitoring user data changes, including file entropy and similarity changes, file type changes, file deletion and file type funneling. The honeyfile-based R-Locker system [gomez2018r] traps and blocks ransomware operations: when ransomware scans the user’s file system and accesses pre-installed decoy files, the R-Locker service is triggered to apply countermeasures. The “Unveil” system introduced in [kharraz2016unveil] detects crypto-ransomware via I/O access patterns. It uses a Windows kernel I/O driver to collect I/O operations and buffer entropy, and it provides early detection of zero-day ransomware. Continella et al. create ShieldFS [continella2016shieldfs], a custom kernel driver that collects and analyzes low-level file-system activity to classify ransomware activity at runtime using a multi-tier hierarchical decision-tree-based process monitoring model. ShieldFS also integrates file backup into its ransomware detection system, so it is able to recover files from trusted secure storage after confirming malicious activity. Sgandurra et al. [SgandurraMML16] proposed “EldeRan”, which dynamically analyzes Windows API calls, registry key operations, file system operations, directory operations, etc. in a sandboxed environment, selects relevant features and finally applies a logistic regression classifier to determine whether an application is “ransomware” or “benignware”.

II-B Adversarial Machine Learning

The first adversarial machine learning attack was used against spam filtering, by generating adversarial text without affecting content readability [Dalvi1014066]. The topic gained significant attention in the security community when Szegedy et al. [SzegedyZSBEGF13] fooled a DNN-based image recognition classifier by adding low-intensity perturbations to the input image that leave it indistinguishable from the original to human eyes. Adversarial attacks on computer vision (CV) receive the most attention: intentionally adding small, humanly imperceptible perturbations to the original images has been shown to drastically alter ML decision boundaries [DBLP:journals/corr/GoodfellowSS14], [DBLP:journals/corr/KurakinGB16], [DBLP:conf/eurosp/PapernotMJFCS16], [DBLP:conf/cvpr/Moosavi-Dezfooli16], [DBLP:conf/sp/Carlini017]. Beyond CV, [DBLP:journals/corr/abs-1801-01944] generate adversarial speech that changes the output of Mozilla’s DeepSpeech, a speech-to-text transcription engine, while sounding perceptually the same. Adversarial malware is created to bypass ML-based detection systems while keeping the maliciousness of the software intact [jungavpass-bh].

Defense techniques including input pre-processing via JPEG compression [DBLP:conf/kdd/DasSCHLCKC18, das2017keeping], feature squeezing [DBLP:conf/ndss/Xu0Q18], novel model architectures using regularization [DBLP:journals/corr/abs-1803-06373], adversarial training [DBLP:journals/corr/MadryMSTV17] and neural fingerprinting [DBLP:journals/corr/abs-1803-03870] have exhibited success in mitigating the proliferating adversarial machine learning attacks.

II-C Generative Adversarial Network

The first generative adversarial network (GAN) introduced uses a fully connected neural network architecture for both the discriminator and the generator [goodfellow2014generative]. Since then, many GAN variants have been proposed. The Deep Convolutional GAN (DCGAN) [radford2015unsupervised] proposes using strided convolutions instead of fully connected multi-layer perceptrons, together with feature normalization, to stabilize training and deal with the poor weight initialization problem. The Conditional GAN (CGAN) [mirza2014conditional] adds a conditional setting to the generator and the discriminator by making both neural networks class-conditional, which helps it better represent multi-modal data generation. The Laplacian Pyramid GAN (LPGAN) [denton2015deep] produces high-quality generated images and uses multiple generators and discriminators in its architecture. It downsamples the input images and, during backpropagation, injects noise generated by a conditional GAN and then upsamples the images. The Auxiliary Classifier GAN (ACGAN) [odena2016conditional] improves the training of GAN by adding more structure to the GAN’s latent space along with a specialized cost function. The Wasserstein GAN (WGAN) [arjovsky2017wasserstein] uses an efficient approximation of the Wasserstein (Earth Mover) distance as the loss function and significantly reduces the mode dropping phenomenon.

III Ransomware Data Description

III-A Data Collection and Description

In our analysis, the ransomware samples are downloaded from VirusTotal, where we collect ransomware submitted around late 2017 based on tags from Microsoft and Kaspersky. The samples are executed in a home-grown bare-metal sandbox system, as seen in Figure 1, and the dynamic behaviors are collected via the .NET Framework FileSystemWatcher (FSW) API. The callback functions bound to FSW are triggered for all file I/O operations. The low-level I/O activity patterns are collected and the normalized Shannon entropy of the targeted file is calculated [scaife2016cryptolock]. To catch evasive ransomware, a user activity simulation program is executed to emulate mouse clicks and key strokes. To mimic an active desktop environment, Notepad and Office Word applications are launched before and during ransomware sample execution. The clean-ware dataset is collected manually by installing and executing around a hundred applications from various categories such as office suites, browsers and file compression applications. The idle I/O activities of a benign Windows system are collected over a few months from regular backups, updates, anti-virus applications and so on.
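The entropy feature mentioned above can be illustrated with a minimal sketch. It assumes that "normalized" means dividing the byte-level Shannon entropy by its maximum of 8 bits per byte; the source does not state the normalization, and the function name is hypothetical.

```python
import math
from collections import Counter


def normalized_shannon_entropy(data: bytes) -> float:
    """Byte-level Shannon entropy scaled to [0, 1].

    ASSUMPTION: normalization divides by 8 bits, the maximum entropy of a byte.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    entropy_bits = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy_bits / 8.0
```

Encrypted content tends to score close to 1, while plain text scores noticeably lower, which is why a jump in this value after a file change is a useful signal.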

Figure 1: A diagram of sandbox system. The Sandbox will execute a binary downloaded from the Control server. The execution log is uploaded to the Data storage. Power Control can shut down Sandbox if needed.

Each sandbox robot, as seen in Figure 1, is pre-installed with several user files such as Windows Office, text or multimedia files. These files are designed to be the target of ransomware and are used as decoy files to filter active ransomware samples. If these files are modified during execution, the sample is assumed to be crypto-ransomware and is added to the malicious dataset. All behavior data are uploaded to the Resilient ML platform [mlplatform], an open source project for data analysis. The names of the decoy files are appended with time stamps before ransomware execution, so each sample sees the same set of user files but with different file names.

III-B Feature Mapping

The execution log collected via FSW contains the time stamp, event name, targeted file name and file entropy, as seen in Figure 2. We keep feature processing to a minimum by mapping each event, combined with the entropy change, to an integer code. The four main file actions are file delete, file create, file rename and file change. The entropy level is combined with the file change event. Hence each execution log is represented by a sequence of events. We set the length of each sample to 3000, so shorter samples are padded with zeros at the end to match the dimension. Table I shows the feature mapping.

Figure 2: A screen shot of dynamic execution log collected using FileSystemWatcher (FSW).

| Events | Feature encoding |
|---|---|
| Padding | 0 |
| File deleted | 1 |
| File content changed and entropy | 2 |
| File content changed and entropy | 3 |
| File content changed and entropy | 4 |
| File created | 5 |
| File content changed and entropy | 6 |
| File renamed | 7 |
| File content changed and entropy | 8 |
| File content changed and entropy | 9 |

Table I: Feature mapping. We attempt the least effort of feature processing and categorize the events into 8 categories. We used 0 to pad the events so they are of the same length.
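The mapping in Table I can be sketched as follows. The codes for padding, delete, create and rename come from the table; the entropy ranges attached to the six "file content changed" codes are not preserved in this copy, so the equal-width bins below are purely hypothetical, as are the event names and the truncation of logs longer than 3000 events.

```python
SEQ_LEN = 3000                      # fixed sample length from the text
PAD, DELETED, CREATED, RENAMED = 0, 1, 5, 7
CHANGE_CODES = [2, 3, 4, 6, 8, 9]   # codes Table I assigns to "changed + entropy"


def encode_event(event: str, entropy: float = 0.0) -> int:
    """Map one logged event to its integer code."""
    if event == "deleted":
        return DELETED
    if event == "created":
        return CREATED
    if event == "renamed":
        return RENAMED
    if event == "changed":
        # HYPOTHETICAL: equal-width bins over normalized entropy in [0, 1].
        idx = min(int(entropy * len(CHANGE_CODES)), len(CHANGE_CODES) - 1)
        return CHANGE_CODES[idx]
    raise ValueError(f"unknown event: {event}")


def encode_log(events, seq_len: int = SEQ_LEN):
    """Encode (event, entropy) pairs and zero-pad at the end to seq_len."""
    codes = [encode_event(name, ent) for name, ent in events][:seq_len]  # truncation is an assumption
    return codes + [PAD] * (seq_len - len(codes))
```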

IV Synthesizing dynamic features via GAN

GANs are mostly used in computer vision to generate images that seem real to human eyes. Because they are typically used in the vision domain, one can terminate training when the generated images look like real images. The inputs in our case study, however, are dynamic execution logs, so it is not practical to stop training a GAN merely by visualizing the generated samples. Furthermore, when we directly employ the typical training mechanism of GANs, mode collapse issues constantly arise. The challenges of training an effective GAN to fool the ransomware classifier motivate us to propose a different GAN training scheme for faster convergence and better-quality sample generation.

The principle of our proposed GAN training scheme is to segment the dynamic execution logs and leverage transfer learning to accelerate training convergence. Each execution log is segmented into subsequences, which are then converted into 2-dimensional arrays. Transfer learning is then employed in the sense that the parameters and neural network architectures are borrowed from existing, successfully convergent GANs used in the vision domain, while we still train from scratch on the fixed architecture.

IV-A Threat Model

We assume that the adversary has knowledge of the training data, but no knowledge at all of the underlying ransomware classifiers. This is a realistic assumption since, for malware detection, anti-virus vendors obtain their training samples from VirusTotal, which allows users to download binaries or hashes.

IV-B Training Procedure

Our approach essentially consists of segmentation and reshaping as preprocessing, GAN training, quality assessment, concatenation and evaluation. An overview of our framework is seen in Figure 3.

Figure 3: Overview of our proposed framework using GAN to generate dynamic features possessing ransomware properties.

IV-B1 Segmentation and reshaping as preprocessing

We observe that, in our initial experiments, the GAN did not converge when trained on the entire logs. This motivates us to train a convergent GAN on log segments instead. After feature mapping, we divide each training execution log into subsequences, each of length $m$. If the length of the execution log is not divisible by $m$, the end of the last subsequence is padded with zeros. Each subsequence is then reshaped into a two-dimensional square array of size $\sqrt{m} \times \sqrt{m}$.
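A minimal sketch of this preprocessing step is given below. The segment length is not preserved in this copy; the default of 784 (a 28 by 28 array, a common input size for vision GANs) is an assumption chosen only for illustration, and the function name is hypothetical.

```python
import math


def segment_and_reshape(log, m: int = 784):
    """Split an encoded log into length-m segments and reshape each into a square array.

    ASSUMPTION: m = 784 (28 x 28) is illustrative; the source value is not preserved.
    """
    side = math.isqrt(m)
    if side * side != m:
        raise ValueError("m must be a perfect square")
    # Zero-pad the end so the length is divisible by m.
    padded = list(log) + [0] * (-len(log) % m)
    segments = [padded[i:i + m] for i in range(0, len(padded), m)]
    # Reshape each flat segment row by row into side x side.
    return [[seg[r * side:(r + 1) * side] for r in range(side)] for seg in segments]
```

With these assumed values, a log of length 3000 yields four arrays, the last of which carries 136 trailing zeros.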

We note that the convergence issue may be resolved through searching the space of neural network architectures and the parameters. However our preprocessing step enables transfer learning to borrow existing convergent GAN architectures, hence saving exhaustive search efforts while still achieving convergence.

IV-B2 Training

The generative adversarial networks (GAN), first introduced in [goodfellow2014generative], are paired neural networks consisting of a generator and a discriminator, which act like two players competing to win a game. The generator produces samples from a generated distribution that should be as close as possible to the real data distribution. The discriminator classifies whether a sample was produced by the generator or truly drawn from the real data distribution. The purpose of the generator is to fool the discriminator, and the purpose of the discriminator is to separate the fake from the real. At the end of training, the generator should, in theory, fool the discriminator as often as possible.

We train an auxiliary classifier generative adversarial network (ACGAN) on the segmented two-dimensional arrays processed from the execution logs. Each real sample lies in the space containing all the real segmented execution logs and is drawn, together with its class label, from the joint distribution of real data and labels. The class label marks a sample as either ransomware or benign.

Each generated sample lies in the space containing all fake samples and is drawn from the generated sample distribution. A binary random variable denotes the data source, indicating whether the data is real or fake. The entire data consist of both real and fake samples.

The generator is a function that maps input noise, together with a class label, to a generated sample. Given the data, the discriminator calculates two probabilities: whether the data is real or fake, and the class label of the sample. Accordingly, the loss function of AC-GAN comes in two parts: a source term, which scores how well real data are told apart from fake data, and a class term, which scores how well the class labels are predicted.

The generator and the discriminator are trained on different combinations of these two terms: both are rewarded for correct class prediction, while they pursue opposite goals on the source term. Adding the auxiliary classifier to the discriminator in AC-GAN stabilizes training.
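For reference, the standard AC-GAN objective of [odena2016conditional] can be written as below. The symbols are notation assumed here rather than recovered from the paper: $S$ is the data source, $C$ the class label, and $X_{\text{real}}$, $X_{\text{fake}}$ the real and generated samples.

```latex
\begin{align}
L_S &= \mathbb{E}\left[\log P(S = \text{real} \mid X_{\text{real}})\right]
     + \mathbb{E}\left[\log P(S = \text{fake} \mid X_{\text{fake}})\right], \\
L_C &= \mathbb{E}\left[\log P(C = c \mid X_{\text{real}})\right]
     + \mathbb{E}\left[\log P(C = c \mid X_{\text{fake}})\right].
\end{align}
```

In that formulation the discriminator is trained to maximize $L_S + L_C$ and the generator to maximize $L_C - L_S$; implementations that minimize a loss typically negate these quantities.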

Because our threat model assumes the adversary has no knowledge of the underlying classifier, the stopping criterion for training our proposed mechanism relies only on the discriminator loss. In a white-box attack, however, where the adversary has knowledge of the ransomware detector, the goal of the attacker is to cause the generated samples from the malicious class to be misclassified as benign. In that setting, a third term, computed with respect to the ransomware detector, can be added to the loss function.

After training, we can generate both fake malicious samples and fake benign samples. From an attacker’s perspective, it is more desirable to generate malicious samples that bypass detection and increase the false negative rate. Hence we focus on the fake malicious samples for subsequent analysis and experiments. Each generated sample is a two-dimensional array of size $\sqrt{m} \times \sqrt{m}$, where $m$ is the segment length, so we flatten it to a 1-dimensional segment of length $m$ and round each entry to the closest integer value. With a slight abuse of notation, we keep the same name for the resulting rounded set.

Figure 4: The ACGAN architecture for generating reshaped execution segments. Left table: the architecture of the generator, where the input is the latent variable and the output is a generated 2-D execution log segment. Right table: the architecture of the discriminator, where the inputs are the 2-D execution log segments, one output predicts benign or malicious via the auxiliary classifier, and the other output predicts real or fake.

IV-C Quality Assessment on the Generated Malicious Samples

Unlike in computer vision, where the quality of the generated samples can be evaluated by visual inspection, evaluating the quality of dynamic execution logs requires a quantifiable metric. We propose a sample-based quality metric $q_n$, where for each sample $x$

$$q_n(x) = \frac{u_m}{u_b},$$

where $u_m = |A_m| - |A_{mb}|$ and $u_b = |A_b| - |A_{mb}|$. Here, $|\cdot|$ denotes the cardinality, $A_m$ is the set of matched $n$-grams between the sample $x$ and the malicious test set, $A_b$ is the set of matched $n$-grams between the sample $x$ and the benign test set, and $A_{mb}$ is the set of matched $n$-grams among the sample $x$, the malicious test set and the benign test set. Passing the quality check means that the generated sample contains more unique malicious $n$-grams than unique benign $n$-grams. Since the real test data was not used for training the ACGAN, the proposed metric evaluates generalized malicious properties that may not be found in the training set.
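The metric can be sketched in a few lines, under the reading that it is the ratio of matched $n$-grams unique to the malicious test set to matched $n$-grams unique to the benign test set. The function names and the handling of a zero denominator are assumptions made for illustration.

```python
def ngrams(seq, n):
    """Set of contiguous n-grams (as tuples) in a sequence of event codes."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def quality_metric(sample, malicious_test, benign_test, n):
    """Ratio of unique malicious matches to unique benign matches for one sample."""
    sample_grams = ngrams(sample, n)
    mal_grams = set().union(*(ngrams(s, n) for s in malicious_test))
    ben_grams = set().union(*(ngrams(s, n) for s in benign_test))

    matched_mal = sample_grams & mal_grams      # matches with the malicious test set
    matched_ben = sample_grams & ben_grams      # matches with the benign test set
    matched_both = matched_mal & matched_ben    # matches shared by all three

    unique_mal = len(matched_mal) - len(matched_both)
    unique_ben = len(matched_ben) - len(matched_both)
    if unique_ben == 0:
        # ASSUMPTION: no unique benign match means unbounded quality, unless nothing matched.
        return float("inf") if unique_mal > 0 else 0.0
    return unique_mal / unique_ben
```

Using sets rather than counts reflects the "unique" wording of the text: repeated occurrences of the same $n$-gram in a sample are counted once.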

For a generated set $\hat{X}$, we calculate the quality metric of each sample and filter out the samples whose quality metric is below a pre-specified threshold $\tau$. Suppose we expect to generate $N$ malicious samples and $k$ of them have a quality metric below $\tau$. Then we regenerate a smaller set of $k$ samples and repeat the process until we obtain $N$ samples of the desired quality.

Similarly, for the entire generated set, we propose a batch-based quality metric that statistically summarizes the sample-based metrics of all its samples. The summary statistics are the minimum, first quartile, median, third quartile, maximum and outliers.

We summarize the quality assessment procedure in Algorithm 1.

Input: Generated set $\hat{X}$ with $N$ samples and quality threshold $\tau$
Output: ;
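Step 1: Calculate the quality metric of each sample in the generated set.
Step 2: Remove samples with bad quality, that is, with a quality metric below $\tau$. Denote the set of bad samples by $\hat{X}_{bad}$, where $|\hat{X}_{bad}| = k$.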

Algorithm 1 Quality assessment procedure
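Algorithm 1 and the batch summary can be sketched as follows, assuming the per-sample scores have already been computed. The function names are hypothetical, and `statistics.quantiles` uses its default "exclusive" method, which may differ slightly from the quartile convention used for the boxplots in the paper.

```python
import statistics


def assess_quality(samples, scores, tau):
    """Split samples into those that pass (score >= tau) and those that fail."""
    passed = [s for s, q in zip(samples, scores) if q >= tau]
    failed = [s for s, q in zip(samples, scores) if q < tau]
    return passed, failed


def batch_summary(scores):
    """Five-number summary of the per-sample quality scores (needs >= 2 scores)."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    return {"min": min(scores), "q1": q1, "median": median, "q3": q3, "max": max(scores)}
```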

IV-D Log Generation and Evaluation

There are a very large number of ways to concatenate the generated segments into full logs. In our experiment, since all the segments pass quality assessment, we can randomly concatenate the individual segments. We note that for even stronger attacks, the attacker can optimize the concatenation with respect to some objective, and this is one of our next research steps.
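The flatten, round and random-concatenate steps might look like the sketch below. The defaults of four segments per log and truncation to 3000 events are assumptions made for illustration, not values stated in this section, and the function names are hypothetical.

```python
import random


def flatten_and_round(array_2d):
    """Flatten a 2-D generated array row by row and round each entry to an integer."""
    return [int(round(v)) for row in array_2d for v in row]


def concatenate_segments(segments, segments_per_log=4, log_len=3000, seed=None):
    """Randomly group flat segments and join each group into one log.

    ASSUMPTION: segments_per_log and log_len are illustrative defaults.
    Leftover segments that do not fill a whole group are dropped.
    """
    rng = random.Random(seed)
    shuffled = list(segments)
    rng.shuffle(shuffled)
    usable = len(shuffled) - len(shuffled) % segments_per_log
    logs = []
    for start in range(0, usable, segments_per_log):
        group = shuffled[start:start + segments_per_log]
        flat = [code for seg in group for code in seg]
        logs.append(flat[:log_len])  # truncate to the fixed log length
    return logs
```

With segments of an assumed length of 784, grouping 5029 segments four at a time gives 1257 logs, which happens to match the counts reported in Section V; this is suggestive but does not confirm the segment size.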

The generated malicious samples, after quality assessment in Sec. IV-C, are fed into the ransomware classifier. The adversarial detection rate is defined as the number of adversarial samples correctly predicted as malicious divided by the total number of adversarial samples. From a defender’s perspective, we can use the adversarial detection rate as another metric to quantify how resilient the malware detector is against adversarial attacks.
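As a small illustration, the adversarial detection rate reduces to a single ratio over the classifier's predictions on the generated malicious logs. The label convention (1 for malicious) and the function name are assumptions.

```python
def adversarial_detection_rate(predictions, malicious_label=1):
    """Fraction of adversarial (generated malicious) samples predicted as malicious."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    detected = sum(1 for p in predictions if p == malicious_label)
    return detected / len(predictions)
```

A rate near 1 indicates a detector that still flags the generated logs, while a rate near 0 indicates that almost all of them evade detection.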

IV-E Summary of Proposed Methodology

In Algorithm 2, we summarize our framework of training ACGAN to generalize dynamic ransomware features and using a set of quality metrics to statistically evaluate the maliciousness of the generated samples.

Input: Desired number of generated malicious samples $N$, quality threshold $\tau$, training data
Step 1: Segmentation and dimension conversion.
Step 2: Train AC-GAN.
Step 3: Generate a set of $N$ malicious segments.
Step 4: Apply the quality assessment procedure to the generated set as in Algorithm 1.
if the number of bad samples $k > 0$ then
     Generate $k$ new segments. Repeat until all generated segments pass quality assessment.
end if
Step 5: Concatenation.
Step 6: Feed the logs into ransomware detectors.
Algorithm 2 Generate dynamic adversarial logs to bypass ransomware detector.
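The regenerate-until-qualified loop in Steps 3 and 4 can be sketched as below. Here `generate` and `score` are hypothetical callables standing in for the trained generator and the quality metric, and `max_rounds` is an added safeguard that is not part of Algorithm 2.

```python
def generate_until_qualified(generate, score, n_wanted, tau, max_rounds=100):
    """Keep regenerating only the missing samples until n_wanted pass the threshold.

    generate(k) -> list of k samples (stand-in for the trained generator).
    score(sample) -> quality metric of one sample.
    """
    accepted = []
    for _ in range(max_rounds):
        missing = n_wanted - len(accepted)
        if missing == 0:
            break
        batch = generate(missing)
        accepted.extend(s for s in batch if score(s) >= tau)
    return accepted
```

Each round requests only as many samples as are still missing, mirroring the "regenerate a smaller set" wording of the quality assessment procedure.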

V Experiment Results

V-A Ransomware Classification on Real Data

Machine learning can be efficient, scalable and accurate at recognizing malicious attacks. We first demonstrate beneficial usage of machine learning to produce highly effective ransomware detection.

The training and testing ratio is set at 80:20, where the training set contains 1292 benign samples and 3736 malicious samples, and the test set contains 324 benign samples and 934 malicious samples. After feature mapping, each execution log is represented as a sequence of events, and the sequence length is set to 3000, where shorter sequences are padded with zeros at the end.

We consider several popular classifiers including Text-CNN [kim2014convolutional], XGBoost [chen2016xgboost], linear discriminant analysis (LDA), Random Forest [breiman2001random], Naive Bayes [mccallum1998comparison], support vector machine with linear kernel (SVM-linear), and support vector machine with radial kernel (SVM-radial). For fair comparison, all classifiers are trained on the same sequences, and no further feature extraction such as $n$-gram is performed prior to the classification algorithms. We note that the raw features are not the same as $n$-gram modeling, which counts the occurrences of the events. We report the classification accuracy, false positive rate (FPR), true positive rate (TPR), F-score [sokolova2006beyond] and area under the ROC curve (AUC) [bradley1997use] for all selected classifiers.
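The threshold-based metrics listed above follow directly from the confusion counts, as in this minimal sketch. It assumes the F-score is the balanced F1 and that label 1 marks the malicious class; AUC is omitted because it requires classifier scores rather than hard predictions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, FPR, TPR and F1 from true and predicted binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    tpr = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "fpr": ratio(fp, fp + tn),
        "tpr": tpr,
        "f1": ratio(2 * precision * tpr, precision + tpr),
    }
```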

As seen in Table II, Text-CNN achieves the highest accuracy at 0.989, a low false positive rate at 0.03, the highest true positive rate at 0.9989, the highest F-score at 0.9796 and the highest AUC at 0.9950 among all selected classifiers. XGB performs second best overall, with accuracy at 0.931 and the lowest false positive rate at 0.023. All other classifiers suffer from either low accuracy or a high false positive rate. However, we expect that using $n$-gram feature extraction would greatly improve the other classifiers’ performance.

Due to Text-CNN’s superior performance, we naturally use it as a feature extractor via the last pooling layer and retrain all the other classifiers on the embedding extracted via Text-CNN. We observe significant improvement of other classifiers composed with Text-CNN, as seen in Table III.

| Classifier | Accuracy | FPR | TPR | F-score | AUC |
|---|---|---|---|---|---|
| Text-CNN | 0.9890 | 0.030 | 0.9989 | 0.9796 | 0.9950 |
| XGB | 0.9308 | 0.023 | 0.7963 | 0.8557 | 0.8869 |
| LDA | 0.5048 | 0.574 | 0.7698 | 0.4077 | 0.6136 |
| Random Forest | 0.9348 | 0.213 | 0.9861 | 0.9497 | 0.8866 |
| Naive Bayes | 0.8704 | 0.250 | 0.9122 | 0.7488 | 0.8457 |
| SVM-linear | 0.4420 | 0.074 | 0.3587 | 0.4906 | 0.8130 |
| SVM-radial | 0.7417 | 0.997 | 0.9979 | 0.0061 | 0.9055 |

Table II: Classification performance on the test set. Text-CNN achieves the highest accuracy at 0.989 and a low false positive rate at 0.03 among all selected classifiers. XGB performs second best with accuracy at 0.931 and the lowest false positive rate at 0.023. All other classifiers suffer from either low accuracy or a high false positive rate.

Figure 5: Class-conditional density plot for each dimension in Text-CNN feature space. Red denotes the malicious class and blue denotes the benign class. Text-CNN as a feature extractor helps separate the samples from two classes, as indicated by the density plots.

| Classifier | Accuracy | FPR | TPR | F-score | AUC |
|---|---|---|---|---|---|
| XGB Text-CNN | 0.9841 | 0.0032 | 0.9475 | 0.9685 | 0.9722 |
| LDA Text-CNN | 0.9865 | 0.0494 | 0.9989 | 0.9731 | 0.9977 |
| Random Forest Text-CNN | 0.9833 | 0.0556 | 0.9968 | 0.9497 | 0.9706 |
| Naive Bayes Text-CNN | 0.9666 | 0.1111 | 0.9936 | 0.9320 | 0.9906 |
| SVM-linear Text-CNN | 0.9881 | 0.0432 | 0.9989 | 0.9764 | 0.9974 |
| SVM-radial Text-CNN | 0.9897 | 0.0228 | 0.9957 | 0.9797 | 0.9993 |

Table III: Classification results on the test set. The performance of all the classical classifiers improves significantly when using Text-CNN as a feature extractor.

Figure 6: ROC curves of XGB, LDA, SVM compared with XGB Text-CNN, LDA Text-CNN and SVM Text-CNN. When using Text-CNN as a feature extractor and retraining XGB, LDA, SVM in the Text-CNN embedding subspace, we observe that all the composed classifiers possess significantly higher classification efficacy measured by AUC, F-score, accuracy, false positive rate and true positive rate.

It is only worthwhile to evaluate the resiliency of a highly effective ransomware classifier. As seen in our experiment results, Text-CNN, whether as a classifier on its own or as a feature extractor, is most likely to be selected by a security defender. Although knowledge of the defender’s ransomware classifier is not needed by our analysis methodology, we evaluate the adversarial detection rate against Text-CNN based classifiers.

V-B Generate Adversarial Segments

We follow the steps in Section IV-B2 to train an AC-GAN [odena2016conditional], where we set the batch size to be 100, the latent dimension to be 100, and the training is stopped at the 80th epoch. After training, we obtain 5029 segments from the malicious class. We round the segments to the nearest integer and denote this set as .

V-C Quality Assessment

A successful evasion means that the generated malicious samples not only fool the ransomware classifier but also retain their maliciousness according to certain metrics. Following Section IV-C, we compute the quality metric of each GAN-generated sample for n-grams with . Figure 7 shows the quality metric on the y-axis against each generated segment on the x-axis for 4-, 5-, 6-grams. We set the quality threshold to be , which means that a qualified generated segment, with statistically measured maliciousness, must match over 50% more unique malicious n-grams than unique benign n-grams.
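The n-gram comparison described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes that the quality metric of a segment is the ratio of the number of unique malicious n-grams it matches to the number of unique benign n-grams it matches (the exact definition is given in Section IV-C), and that "unique" means seen in one class of real logs but not the other. The function names and the toy event IDs are hypothetical.

```python
def ngrams(seq, n):
    """Return the set of distinct n-grams (as tuples) in a sequence of event IDs."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def quality_metric(segment, malicious_logs, benign_logs, n):
    """Ratio of matched malicious-only n-grams to matched benign-only n-grams.

    Assumption of this sketch: an n-gram is 'unique malicious' if it occurs
    in the real malicious logs but in none of the real benign logs, and
    'unique benign' in the symmetric case.
    """
    mal = set().union(*(ngrams(log, n) for log in malicious_logs))
    ben = set().union(*(ngrams(log, n) for log in benign_logs))
    unique_mal, unique_ben = mal - ben, ben - mal

    seg = ngrams(segment, n)
    matched_mal = len(seg & unique_mal)
    matched_ben = len(seg & unique_ben)
    if matched_ben == 0:
        # No benign evidence at all: unbounded ratio if anything malicious matched.
        return float("inf") if matched_mal > 0 else 0.0
    return matched_mal / matched_ben


# Toy data (hypothetical event IDs, not taken from the paper's dataset).
malicious_logs = [[1, 2, 3, 4, 5, 6]]
benign_logs = [[7, 8, 9, 10, 11, 12]]
segment = [1, 2, 3, 4, 5, 9, 10, 11, 12]
score = quality_metric(segment, malicious_logs, benign_logs, n=4)
```

A generated segment would then be kept only if its score exceeds the chosen quality threshold for every n under consideration.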

Figure 7: Quality metric for 4-, 5-, 6-grams. All the generated segments have , where and . Hence the generated segments match, at minimum, almost twice as many unique malicious signatures as unique benign signatures for 4-, 5-, 6-grams.

We also plot the batch-based quality metric for n-grams, represented as boxplots in Figure 8. As shown in the boxplots, all the generated segments are statistically much closer to the real malicious class with and .

Figure 8: Boxplots of to evaluate the generated batch quality. All the generated segments have , with for all n-grams.

All the generated and qualified segments are concatenated randomly to produce 1257 execution logs.
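One way to picture the random concatenation step is sketched below. The grouping rule is not stated in this section, so the sketch assumes a fixed number of segments per log and drops any leftover segments; four segments per log is chosen only because 5029 // 4 = 1257, and should be read as a guess. `build_logs` and the placeholder segments are hypothetical.

```python
import random


def build_logs(segments, segments_per_log, seed=0):
    """Shuffle segments and concatenate fixed-size groups into execution logs.

    Assumptions of this sketch: every log uses the same number of segments,
    and leftover segments that do not fill a whole group are dropped.
    """
    rng = random.Random(seed)  # seeded so the shuffle is reproducible
    shuffled = list(segments)
    rng.shuffle(shuffled)

    n_logs = len(shuffled) // segments_per_log
    logs = []
    for i in range(n_logs):
        group = shuffled[i * segments_per_log:(i + 1) * segments_per_log]
        # Flatten the group of segments into one sequence of events.
        logs.append([event for seg in group for event in seg])
    return logs


# Placeholder segments of two events each (not real generated data).
segments = [[i, i + 1] for i in range(5029)]
logs = build_logs(segments, segments_per_log=4)
```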

V-D Evasion

The high-performing ransomware detectors Text-CNN, XGB Text-CNN, LDA Text-CNN, Random forest Text-CNN, Naive Bayes Text-CNN, SVM-linear Text-CNN, and SVM-radial Text-CNN are applied to the adversary-generated logs. We report the number of detected samples and the detection rate in Table IV.

Most of the classifiers degrade significantly in detection performance: Text-CNN, LDA Text-CNN, Naive Bayes Text-CNN, and SVM-linear Text-CNN fail to detect any generated malicious samples, while XGB Text-CNN detects 12.73% correctly and Random forest Text-CNN detects 36.35% correctly. The most robust classifier in this experiment turns out to be SVM-radial Text-CNN, with a 100% detection rate. This may be due to its nonlinear boundary in the Text-CNN latent feature space. However, only one of the seven highly effective classifiers is resilient to our bypass scheme. Our adversarial detection results clearly indicate a potential vulnerability in ML-based ransomware detection systems.

Classifier No. detected Detection rate (%)
Text-CNN 0 0
XGB Text-CNN 16 12.73
LDA Text-CNN 0 0
Random forest Text-CNN 457 36.35
Naive Bayes Text-CNN 0 0
SVM-linear Text-CNN 0 0
SVM-radial Text-CNN 1257 100
Table IV: Adversarial detection rate on the generated malicious samples. Six of the seven highly effective classifiers degrade severely in performance and only one classifier remains resilient against the attack. This quantifies the attack surface for these ML-based ransomware detection algorithms. The non-linear boundary of SVM-radial Text-CNN effectively detects the adversarial samples.
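The adversarial detection rate reported in Table IV is the share of generated malicious logs that a classifier still flags as malicious. A minimal sketch of that computation is given below; the label encoding (1 for malicious) and the function name are assumptions, and the prediction list is a stand-in that mirrors only the detected count and total of the Random forest Text-CNN row.

```python
def adversarial_detection_rate(predictions, malicious_label=1):
    """Return (number detected, detection rate in percent).

    `predictions` holds one predicted label per adversarial log; since every
    such log is malicious by construction, each 'malicious' prediction counts
    as a detection.
    """
    if not predictions:
        raise ValueError("no predictions supplied")
    detected = sum(1 for p in predictions if p == malicious_label)
    return detected, 100.0 * detected / len(predictions)


# Stand-in predictions: 457 of 1257 generated logs flagged as malicious.
predictions = [1] * 457 + [0] * (1257 - 457)
detected, rate = adversarial_detection_rate(predictions)
```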

V-E Latent Feature Space Investigation

We investigate why most of the highly effective classifiers fail to predict the adversarially generated samples correctly. We use the last pooling layer from Text-CNN as a feature extractor and refer to the space of features extracted by Text-CNN as the latent feature subspace. The classifiers that achieve effective and competitive classification performance are XGB, LDA, Random Forest, Naive Bayes and SVM trained in the latent feature subspace. Text-CNN itself, used as a classifier, has linear boundaries in the latent feature subspace via its fully connected layer. Hence a natural starting point is to examine how the generated samples and the real samples relate in the latent feature subspace induced by Text-CNN, in comparison with their relationship in the original feature space consisting of the raw execution logs.

Represented in a 2-D visualization, Figure 9 shows that the generated samples, in dark red, lie close to a linear boundary but much closer to the real benign samples in the Text-CNN latent feature subspace. However, as shown in Section V-C, most of the generated samples match more than twice as many unique ransomware signatures as unique benign signatures. This motivates us to explore the distance between the real malicious samples and real benign samples, as well as between the generated samples and the real samples, in both the latent feature subspace and the original feature space.

Denote the latent features of the generated malicious logs as , the latent features of the training malicious logs as and the latent features of the training benign logs as . Similarly, for the test data, the latent malicious and benign features are denoted as and respectively.

We plot the density of the -distances between test malicious data and training data, both of which are real samples. The left figure in Figure 10 shows, in the original feature space, the density of the distance between the malicious test logs and the training malicious logs in red and the density of the distance between the malicious test logs and the training benign logs in blue. The dashed red and blue vertical lines represent the means of and respectively. On average, the malicious test logs are closer to the training malicious logs than to the training benign logs. However in the original data space, the distributions of distances are not very well-separated and this is also reflected in the algorithm performance on the original data space as shown in Table II.

The right figure in Figure 10 plots the density of the distance between and in red and the density of the distance between and in blue. The dashed red and blue vertical lines represent the means of and respectively. is much closer to than to . The distances are consistent across original feature space and the latent feature subspace. This observation is expected since the malicious samples should be close together in either feature space.

Figure 9: Visualization of the Text-CNN extracted features for (left) PC-dimension 1 vs PC-dimension 2; (middle) PC-dimension 1 vs PC-dimension 3; (right) PC-dimension 2 vs PC-dimension 3. The generated malicious samples are colored in dark red, and lie closer to the benign set in the Text-CNN subspace. We draw the 95% data ellipse around the scattered points.

Next we examine whether the observed phenomenon extends to the generated samples and real samples. The left figure in Figure 11 plots, in the original feature space, the density of the -distance between the generated logs and the training malicious logs in red and the density of the distance between the generated logs and the training benign logs in blue. The dashed red and blue vertical lines represent the means of and respectively. The generated malicious logs are much closer to the real malicious logs than to the real benign logs in the original feature space.

The right figure in Figure 11 plots, in the latent feature space, the density of the -distance between and in red and the density of the distance between and in blue. The dashed red and blue vertical lines represent the means of and respectively. is much closer to than to . Figure 11 shows that in the Text-CNN feature subspace, the generated logs are closer to the benign logs, while in the original feature space, the generated logs are closer to the malicious logs. This phenomenon indicates that the generated adversarial samples lie in the blind spot of the Text-CNN algorithm.

Figure 10: Density plot of the distances between real benign and real malicious logs in both original feature space and Text-CNN latent feature space.
Figure 11: Density plot of the distances between generated logs and real logs in both original feature space and Text-CNN latent feature space.
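The comparison behind Figures 10 and 11 can be sketched as a mean-distance check carried out once per feature space. This is an illustration under assumptions: it uses Euclidean distance (the distance type is not recoverable from this text), it summarizes each comparison by the mean over all sample pairs, and the two-dimensional toy vectors are hypothetical stand-ins for execution logs and Text-CNN features.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_distance(samples, reference):
    """Mean pairwise distance between two collections of feature vectors."""
    total = sum(euclidean(s, r) for s in samples for r in reference)
    return total / (len(samples) * len(reference))


def closer_class(generated, train_malicious, train_benign):
    """Report which training class the generated samples are closer to on average."""
    d_mal = mean_distance(generated, train_malicious)
    d_ben = mean_distance(generated, train_benign)
    return ("malicious" if d_mal < d_ben else "benign"), d_mal, d_ben


# Toy 'original feature space': generated samples sit near the malicious set.
orig_result = closer_class(
    generated=[[9.0, 9.0], [10.0, 10.0]],
    train_malicious=[[10.0, 9.0], [9.0, 10.0]],
    train_benign=[[0.0, 0.0], [1.0, 1.0]],
)

# Toy 'latent feature space': the same samples land near the benign set.
latent_result = closer_class(
    generated=[[0.1, 0.2]],
    train_malicious=[[1.0, 1.0]],
    train_benign=[[0.0, 0.0]],
)
```

A disagreement between the two results, malicious in the original space but benign in the latent space, is the pattern the section describes as a blind spot.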

VI Discussion

In this paper, we describe a framework based on a generative adversarial network to synthesize dynamic ransomware samples and propose a set of quality metrics based on statistical similarity to quantify the maliciousness of the GAN-generated samples. We demonstrate in our experiments that six of the seven highly effective ransomware classifiers fail to detect most of the GAN-generated samples.

Our proposed framework should be utilized as a defensive capability for developing a resilient model for detecting ransomware in the field. As described in Section IV-D, a defender can use the adversarial detection rate as a metric to quantify the resilience of the ransomware detector against adversarial attacks. The defender can use the GAN-generated samples as part of the training procedure to update the defender’s classifier. Our proposed quality assessment approach can be leveraged even when the model is deployed and is in use in the field to track the changes in distance between generated and real samples. These robustness mechanisms must be considered as an integral part of an adversary-resilient malware classifier.

Our case study evaluating a broad range of ransomware classifiers also demonstrates the pitfalls of selecting classifiers based on high accuracy and low false positive rates, which is typical today in malware detection. After a deeper analysis with quality adversarial samples, the most robust classifier in our experiment is verified to be SVM-radial Text-CNN. This analysis may form the basis for selecting multi-classifier, ensemble-based approaches that act as a defense-in-depth against adversarial probing attacks once the ransomware classifiers are deployed in the field. In our specific case study, a weighted score between the XGB Text-CNN classifier and the SVM-radial Text-CNN classifier gives the defender much more coverage in the space of execution logs for ransomware.
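A weighted score between two classifiers can be as simple as a convex combination of their maliciousness scores. The sketch below is illustrative only: the weight, the decision threshold, and the assumption that both classifiers emit scores in [0, 1] are hypothetical choices that the paper does not specify.

```python
def weighted_score(score_xgb, score_svm, weight_xgb=0.5):
    """Convex combination of two classifiers' maliciousness scores in [0, 1]."""
    if not 0.0 <= weight_xgb <= 1.0:
        raise ValueError("weight_xgb must be in [0, 1]")
    return weight_xgb * score_xgb + (1.0 - weight_xgb) * score_svm


def is_ransomware(score_xgb, score_svm, weight_xgb=0.5, threshold=0.5):
    """Flag a log as ransomware when the combined score reaches the threshold."""
    return weighted_score(score_xgb, score_svm, weight_xgb) >= threshold


# An adversarial log that one classifier misses (score 0.0) and the other
# catches (score 1.0) is still flagged when the second is weighted heavily.
flagged = is_ransomware(score_xgb=0.0, score_svm=1.0, weight_xgb=0.25)
```

In practice the weight and threshold would be tuned on held-out data that includes adversarial samples, so that the ensemble is not selected on accuracy and false positive rate alone.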

Lastly, it is important to note that our framework remains useful for enforcing the resiliency of the ransomware detection model even when the model is deployed on a platform using software and hardware-based Trusted Execution Environments (TEEs), which protect the run-time confidentiality and integrity of the classifier(s) while in use. The framework thus provides the defender with an additional tool to continue enforcing the security objectives consistently even after the training stage.