Is Data Clustering in Adversarial Settings Secure?

11/25/2018
by Battista Biggio, et al.

Clustering algorithms are increasingly adopted in security applications to spot dangerous or illicit activities. However, they were not originally designed to withstand deliberate attacks that aim to subvert the clustering process itself, so whether clustering can be safely adopted in such settings remains questionable. In this work, we propose a general framework for identifying potential attacks against clustering algorithms and evaluating their impact, based on specific assumptions about the adversary's goal, knowledge of the attacked system, and capability to manipulate the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated so that they hide within existing clusters. We present a case study on single-linkage hierarchical clustering and report experiments on clustering malware samples and handwritten digits.
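
To make the poisoning idea concrete, here is a minimal toy sketch of why single-linkage clustering is sensitive to a few added samples. It is not the attack algorithm from the paper: the helper `single_linkage_clusters`, the grid data, the distance threshold of 1.5, and the hand-placed "bridge" points are all assumptions chosen for illustration. Cutting a single-linkage dendrogram at a threshold is equivalent to taking connected components of the graph that links every pair of points within that distance, so a short chain of attack points placed in the gap can merge two otherwise well-separated clusters.

```python
from itertools import combinations
from math import dist


def single_linkage_clusters(points, threshold):
    """Return the number of clusters when single linkage is cut at `threshold`.

    Hypothetical helper: it uses union-find to compute connected components
    of the graph whose edges join points at distance <= threshold.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Two clean 5x5 grids (unit spacing), separated by a gap of 5 along x.
cluster_a = [(float(x), float(y)) for x in range(5) for y in range(5)]
cluster_b = [(float(x), float(y)) for x in range(9, 14) for y in range(5)]
clean = cluster_a + cluster_b

# Three attack samples forming a chain across the gap; consecutive hops
# are 1.25 apart, which is below the assumed threshold of 1.5.
bridge = [(5.25, 2.0), (6.5, 2.0), (7.75, 2.0)]

clean_count = single_linkage_clusters(clean, threshold=1.5)
poisoned_count = single_linkage_clusters(clean + bridge, threshold=1.5)
print(clean_count, poisoned_count)
```

In this toy setup, three added points against fifty clean ones are enough to collapse the two groups into one at the chosen threshold. A real attacker would have to choose the points under the knowledge and capability assumptions the abstract describes.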

research
11/25/2018

Poisoning Behavioral Malware Clustering

Clustering algorithms have become a popular tool in computer security to...
research
11/16/2019

Suspicion-Free Adversarial Attacks on Clustering Algorithms

Clustering algorithms are used in a large number of applications and pla...
research
01/11/2023

Universal Detection of Backdoor Attacks via Density-based Clustering and Centroids Analysis

In this paper, we propose a Universal Defence based on Clustering and Ce...
research
05/31/2022

Semantic Autoencoder and Its Potential Usage for Adversarial Attack

Autoencoder can give rise to an appropriate latent representation of the...
research
05/25/2020

Adversarial Feature Selection against Evasion Attacks

Pattern recognition and machine learning techniques have been increasing...
research
05/01/2019

On the Convergence Rates of Learning-based Signature Generation Schemes to Contain Self-propagating Malware

In this paper, we investigate the importance of a defense system's learn...
research
02/15/2020

Security of HyperLogLog (HLL) Cardinality Estimation: Vulnerabilities and Protection

Count distinct or cardinality estimates are widely used in network monit...
