Poisoning Behavioral Malware Clustering

11/25/2018
by Battista Biggio, et al.

Clustering algorithms have become a popular tool in computer security for analyzing the behavior of malware variants, identifying novel malware families, and generating signatures for antivirus systems. However, their suitability for security-sensitive settings has recently been questioned: they can be significantly compromised if an attacker can exercise some control over the input data. In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker can subvert them through the careful injection of samples with poisoning behavior. To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments demonstrate not only that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can inject only a very small percentage of attack samples into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
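
To make the idea of poisoning a clustering concrete, the following is a minimal toy sketch, not the attack or the feature representation studied in the paper and not Malheur's implementation. It assumes samples are 2-D points, uses a simple threshold-based single-linkage grouping, and injects a hypothetical chain of "bridge" points so that two well-separated families merge into one cluster. All names (`count_clusters`, `family_a`, `family_b`, `bridge`) and all coordinates are illustrative assumptions, and the toy uses far more injected points relative to the data than a realistic attack would.

```python
import math


def count_clusters(points, threshold):
    """Count single-linkage clusters: two points share a cluster when a
    chain of neighbours, each at distance <= threshold, connects them."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Two hypothetical, well-separated "malware families" (illustrative data).
family_a = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
family_b = [(5.0, 0.0), (5.5, 0.0), (5.0, 0.5), (5.5, 0.5)]
clean = family_a + family_b

# Attacker-injected samples forming a chain between the two families.
bridge = [(1.4, 0.25), (2.3, 0.25), (3.2, 0.25), (4.1, 0.25)]
poisoned = clean + bridge

clean_count = count_clusters(clean, threshold=1.0)
poisoned_count = count_clusters(poisoned, threshold=1.0)
print(clean_count, poisoned_count)
```

On the clean data the two families stay separate; once the bridge points are added, every consecutive gap falls below the threshold and the families collapse into a single cluster. Single linkage is chosen here only because it makes the bridging effect easy to see.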


research
11/25/2018

Is Data Clustering in Adversarial Settings Secure?

Clustering algorithms have been increasingly adopted in security applica...
research
11/06/2019

The Naked Sun: Malicious Cooperation Between Benign-Looking Processes

Recent progress in machine learning has generated promising results in b...
research
03/31/2020

When the Guard failed the Droid: A case study of Android malware

Android malware is a persistent threat to billions of users around the w...
research
08/12/2022

On deceiving malware classification with section injection

We investigate how to modify executable files to deceive malware classif...
research
05/04/2020

Mind the Gap: On Bridging the Semantic Gap between Machine Learning and Information Security

Despite the potential of Machine learning (ML) to learn the behavior of ...
research
04/14/2020

Topology-Aware Hashing for Effective Control Flow Graph Similarity Analysis

Control Flow Graph (CFG) similarity analysis is an essential technique f...
research
06/23/2021

Learning Explainable Representations of Malware Behavior

We address the problems of identifying malware in network telemetry logs...
