BinImg2Vec: Augmenting Malware Binary Image Classification with Data2Vec

09/02/2022
by Joon Sern Lee, et al.

Rapid digitalisation spurred by the Covid-19 pandemic has resulted in more cyber crime. Malware-as-a-service is now a booming business for cyber criminals. With the surge in malware activity, it is vital for cyber defenders to understand more about the malware samples they have at hand, as such information can greatly influence their next course of action during a breach. Recently, researchers have shown how malware family classification can be done by first converting malware binaries into grayscale images and then passing them through neural networks for classification. However, most work focuses on the impact of different neural network architectures on classification performance. In the last year, researchers have shown that augmenting supervised learning with self-supervised learning can improve performance. Even more recently, Data2Vec was proposed as a modality-agnostic self-supervised framework for training neural networks. In this paper, we present BinImg2Vec, a framework for training malware binary image classifiers that incorporates both self-supervised and supervised learning to produce a model that consistently outperforms one trained only via supervised learning. We were able to achieve a 4 … 0.5 … framework produces embeddings that can be well clustered, facilitating model explainability.
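The binary-to-image step mentioned in the abstract can be illustrated with a minimal sketch. It assumes one byte maps to one pixel, a fixed row width, and zero padding of the last row; the function name `bytes_to_grayscale`, the `width` parameter, and the padding choice are all hypothetical and are not taken from the paper, which may use a different layout or resizing scheme.

```python
import numpy as np


def bytes_to_grayscale(data: bytes, width: int = 256) -> np.ndarray:
    """Map each byte to one pixel (0-255) and stack the pixels into rows.

    Assumption: the image has a fixed row width and the final row is
    zero-padded. This is an illustrative choice, not the paper's pipeline.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(pixels) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(pixels)] = pixels
    return padded.reshape(height, width)


# Synthetic stand-in for a binary file's contents (not a real sample).
sample = bytes(range(256)) * 3 + b"\x4d\x5a"
image = bytes_to_grayscale(sample, width=64)
print(image.shape, image.dtype)
```

The resulting 2-D `uint8` array can be saved as a grayscale image or resized to whatever input size a classifier expects; how the row width is chosen for real binaries is left open here.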


