Throttling Malware Families in 2D

01/29/2019
by Mohamed Nassar, et al.

Malicious software is categorized into families based on static and dynamic characteristics, infection methods, and the nature of the threat. Visual exploration of malware instances and families in a low-dimensional space gives a first overview of the dependencies and relationships among these instances, helps detect their groups, and isolates outliers. Furthermore, visual exploration of different feature sets is useful for assessing how well each set carries a valid abstract representation, which can later be used in classification and clustering algorithms to achieve high accuracy. In this paper, we investigate t-SNE, one of the best dimensionality reduction techniques, to reduce the malware representation from a high-dimensional space of thousands of features to a low-dimensional space. We experiment with different feature sets and depict malware clusters in 2-D. Surprisingly, t-SNE not only provides nice 2-D drawings but also dramatically increases the generalization power of SVM classifiers. Moreover, the results show that cross-validation accuracy is much better using the 2-D embedded representation of samples than using the original high-dimensional representation.
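The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. This is not the authors' pipeline: the synthetic blobs stand in for malware feature vectors, and the helper name `embed_and_score`, the RBF kernel, the perplexity value, and the fold count are all assumptions chosen for illustration. Because t-SNE has no out-of-sample transform, the sketch embeds all samples (without labels) before cross-validation, which is also an assumption about the setup.

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def embed_and_score(X, y, perplexity=30.0, folds=5, seed=0):
    """Embed X in 2-D with t-SNE, then compare SVM cross-validation
    accuracy on the original features and on the 2-D embedding.

    Hypothetical helper: the parameter defaults are illustrative only.
    """
    # t-SNE is unsupervised: labels are not used to build the embedding.
    embedding = TSNE(
        n_components=2,
        perplexity=perplexity,
        init="pca",
        random_state=seed,
    ).fit_transform(X)

    # Mean accuracy over stratified folds, original high-dimensional space.
    acc_high = cross_val_score(SVC(kernel="rbf"), X, y, cv=folds).mean()
    # Same classifier and folds, 2-D embedded space.
    acc_2d = cross_val_score(SVC(kernel="rbf"), embedding, y, cv=folds).mean()
    return embedding, acc_high, acc_2d


# Synthetic stand-in for malware samples: 150 points, 50 features, 3 "families".
X, y = make_blobs(n_samples=150, n_features=50, centers=3, random_state=0)
embedding, acc_high, acc_2d = embed_and_score(X, y)
print(f"high-dimensional CV accuracy: {acc_high:.3f}")
print(f"2-D embedding CV accuracy:    {acc_2d:.3f}")
```

On real malware features the two accuracies may differ considerably, and the 2-D result depends on the perplexity and random seed, so several settings are worth comparing before drawing conclusions.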


