A Comprehensive Study on Learning-Based PE Malware Family Classification Methods

10/29/2021
by   Yixuan Ma, et al.
0

Driven by the high profit, Portable Executable (PE) malware has been consistently evolving in terms of both volume and sophistication. PE malware family classification has gained great attention and a large number of approaches have been proposed. With the rapid development of machine learning techniques and the exciting results they achieved on various tasks, machine learning algorithms have also gained popularity in the PE malware family classification task. Three mainstream approaches that use learning based algorithms, as categorized by the input format the methods take, are image-based, binary-based and disassembly-based approaches. Although a large number of approaches are published, there is no consistent comparisons on those approaches, especially from the practical industry adoption perspective. Moreover, there is no comparison in the scenario of concept drift, which is a fact for the malware classification task due to the fast evolving nature of malware. In this work, we conduct a thorough empirical study on learning-based PE malware classification approaches on 4 different datasets and consistent experiment settings. Based on the experiment results and an interview with our industry partners, we find that (1) there is no individual class of methods that significantly outperforms the others; (2) All classes of methods show performance degradation on concept drift (by an average F1-score of 32.23 and (3) the prediction time and high memory consumption hinder existing approaches from being adopted for industry usage.

READ FULL TEXT
research
09/04/2023

MalwareDNA: Simultaneous Classification of Malware, Malware Families, and Novel Malware

Malware is one of the most dangerous and costly cyber threats to nationa...
research
11/22/2021

A Comparison of State-of-the-Art Techniques for Generating Adversarial Malware Binaries

We consider the problem of generating adversarial malware by a cyber-att...
research
10/08/2020

Transcending Transcend: Revisiting Malware Classification with Conformal Evaluation

Machine learning for malware classification shows encouraging results, b...
research
11/10/2017

Dynamic Analysis of Executables to Detect and Characterize Malware

It is needed to ensure the integrity of systems that process sensitive i...
research
09/18/2023

Efficient Concept Drift Handling for Batch Android Malware Detection Models

The rapidly evolving nature of Android apps poses a significant challeng...
research
03/07/2021

On Ensemble Learning

In this paper, we consider ensemble classifiers, that is, machine learni...
research
08/21/2018

MLPdf: An Effective Machine Learning Based Approach for PDF Malware Detection

Due to the popularity of portable document format (PDF) and increasing n...

Please sign up or login with your details

Forgot password? Click here to reset