SMART: Semantic Malware Attribute Relevance Tagging

05/15/2019
by Felipe N. Ducau, et al.
With the rapid proliferation and increased sophistication of malicious software (malware), detection methods no longer rely only on manually generated signatures but also incorporate more general approaches such as Machine Learning (ML) detection. Although powerful for convicting malicious artifacts, these methods do not produce any further information about the type of malware that has been detected. In this work, we address the information gap between ML and signature-based detection methods by introducing an ML-based tagging model that generates human-interpretable semantic descriptions of malicious software (e.g., file-infector, coin-miner), and argue that for less prevalent malware campaigns these provide potentially more useful and flexible information than malware family names. To this end, we first introduce a method for deriving high-level descriptions of malware files from an ensemble of vendor family names. We then formalize the problem of malware description as a tagging problem and propose a joint embedding deep neural network architecture that can learn to characterize portable executable (PE) files based on static analysis, thus not requiring a dynamic trace to identify behaviors at deployment time. We empirically demonstrate that, when evaluated against tags extracted from an ensemble of anti-virus detection names, the proposed tagging model correctly identifies more than 93.7% of the tag descriptions for a given sample, at a deployable false positive rate (FPR) of 1%. Furthermore, we show that when evaluating this model against ground truth tags derived from the results of dynamic analysis, it correctly predicts 93.5% of the labels for a given sample. These results suggest that an ML tagging model can be effectively deployed alongside a detection model for malware description.
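
To make the idea of deriving high-level tags from an ensemble of vendor detection names more concrete, the following is a minimal sketch, not the authors' method. It assumes a small hand-written token-to-tag table and a simple vote threshold across vendors; the table contents, the function names (`tokenize`, `derive_tags`), and the `min_votes` parameter are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical token-to-tag table (an assumption for illustration only;
# the abstract does not specify the actual vocabulary or mapping).
TOKEN_TO_TAG = {
    "miner": "coin-miner",
    "coinminer": "coin-miner",
    "bitcoinminer": "coin-miner",
    "infector": "file-infector",
    "virut": "file-infector",
}


def tokenize(detection_name):
    """Split a vendor detection name into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", detection_name.lower()) if t]


def derive_tags(detection_names, min_votes=2):
    """Return the tags supported by at least `min_votes` vendor detections.

    Each detection name contributes at most one vote per tag, so a single
    vendor cannot push a tag over the threshold on its own.
    """
    votes = Counter()
    for name in detection_names:
        tags_for_vendor = {
            TOKEN_TO_TAG[token] for token in tokenize(name) if token in TOKEN_TO_TAG
        }
        votes.update(tags_for_vendor)
    return {tag for tag, count in votes.items() if count >= min_votes}


if __name__ == "__main__":
    # Made-up detection names standing in for an ensemble of vendor outputs.
    sample_detections = [
        "Trojan.Win32.CoinMiner.abc",
        "Win32/BitCoinMiner",
        "W32.Virut.G",
    ]
    print(derive_tags(sample_detections))
    print(derive_tags(sample_detections, min_votes=1))
```

Requiring agreement from more than one vendor is one plausible way to reduce noise from an individual vendor's naming; a tagging model like the one described would then be trained against the resulting tag sets.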


