1 Introduction
Clustering algorithms are a fundamental tool for data analysts, as they allow them to make inferences and gain insights on large sets of unlabeled data. Applications of clustering span a large number of domains, such as market segmentation [14, 26], classification of web pages [10], and image segmentation [12]. In the specific domain of computer security, clustering algorithms have recently been exploited to solve many different problems, e.g., spotting fast-flux domains in DNS traffic [24], gaining useful insights on tools and sources of attacks against Internet websites [25], detecting repackaged Android applications [16] and (Android) mobile malware [9], and even automatically generating signatures for antivirus software to enable detection of HTTP-based malware [23].
In many of the aforementioned scenarios, a large amount of data is collected in the wild, in an unsupervised manner. For instance, malware samples are often collected from the Internet by means of honeypots, i.e., machines that purposely expose known vulnerabilities to be infected by malware [28], or other ad hoc services, like VirusTotal (http://virustotal.com). Given that these scenarios are intrinsically adversarial, it may be possible for an attacker to inject carefully crafted samples into the collected data in order to subvert the clustering process and make the inferred knowledge useless. This raises the issue of evaluating the security of clustering algorithms against carefully designed attacks, and proposing suitable countermeasures when required. It is worth noting that results from the literature on clustering stability [29] cannot be directly exploited to this end, since the noise induced by adversarial manipulations is not generally stochastic but specifically targeted against the clustering algorithm.
The problem of learning in adversarial environments has recently gained increasing popularity, and relevant research has been done especially in the area of supervised learning algorithms for classification [6, 8, 17, 3] and regression [13]. On the other hand, to the best of our knowledge, only a few works have implicitly addressed the issue of security evaluation related to the application of clustering algorithms in adversarial settings through the definition of suitable attacks, while we are not aware of any work that proposes specific countermeasures to attacks against clustering algorithms.
The problem of devising specific attacks to subvert the clustering process was first brought to light by Dutrisac and Skillicorn [11, 27]. They pointed out that some points can be easily hidden within an existing cluster by forming a fringe cluster, i.e., by placing such points sufficiently close to the border of the existing cluster. They further devised an attack that consists of adding points in between two clusters to merge them, based on the notion of bridging. Despite these pioneering attempts, a framework for the systematic security evaluation of clustering algorithms in adversarial settings is still missing, as well as a more general theory that takes into account the presence of the adversary to develop more secure clustering algorithms.
In this work we aim to take a first step towards filling this gap, by proposing a framework for the security evaluation of clustering algorithms, which allows us to consider several potential attack scenarios, and to devise the corresponding attacks, in a more systematic manner. Our framework, inspired by previous work on the security evaluation of supervised learning algorithms [6, 17, 3], is grounded on a model of the attacker that allows one to make specific assumptions on the adversary’s goal, knowledge of the attacked system, and capability of manipulating the input data, and to subsequently formalize a corresponding optimal attack strategy. This work is thus explicitly intended to provide a cornerstone for the development of an adversarial clustering theory, which should in turn foster research in this area.
The proposed framework for security evaluation is presented in Sect. 2. In Sect. 3 we derive worst-case attacks in which the attacker has perfect knowledge of the attacked system. In particular, we formalize the notion of (worst-case) poisoning and obfuscation attacks against a clustering algorithm, respectively in Sects. 3.1 and 3.2. In the former case, the adversary aims at maximally compromising the clustering output by injecting a number of carefully designed attack samples, whereas in the latter one, she tries to hide some attack samples into an existing cluster by manipulating their feature values, without significantly altering the clustering output on the rest of the data. As a case study, we evaluate the security of single-linkage hierarchical clustering against poisoning and obfuscation attacks in Sect. 4. The underlying reason is simply that single-linkage hierarchical clustering has been widely used in security-related applications [4, 16, 23, 24]. To cope with the computational problem of deriving an optimal attack, in Sects. 4.1 and 4.2 we propose heuristic approaches that serve our purposes well. Finally, in Sect. 5 we conduct synthetic and real-world experiments that demonstrate the effectiveness of the proposed attacks, and subsequently discuss limitations and future extensions of our work in Sect. 6.
2 Attacking Clustering
In this section we present our framework to analyze the security of clustering approaches from an adversarial pattern recognition perspective. It is grounded on a model of the adversary that can be exploited to identify and devise attacks against clustering algorithms. Our framework is inspired by previous work focused on attacking (supervised) machine learning algorithms [6], and it relies on an attack taxonomy similar to the one proposed in [17, 3]. As in [6], the adversary’s model entails the definition of the adversary’s goal, knowledge of the attacked system, and capability of manipulating the input data, according to well-defined guidelines.
Before moving into the details of our framework, we introduce some notation. Clustering is the problem of organizing a set of data points into groups, referred to as clusters, in a way that some criterion is satisfied. A clustering algorithm can thus be formalized in terms of a function $f$ mapping a given dataset $D$ to a clustering result $C = f(D)$. We do not specify the mathematical structure of $C$ at this point of our discussion because there exist different types of clustering requiring different representations, while our model applies to any of them. Indeed, $C$ might be a hard or soft partition of $D$ delivered by partitional clustering algorithms such as k-means, fuzzy c-means or normalized cuts, or it could be a more general family of subsets of $D$ such as the one delivered by the dominant sets clustering algorithm [22], or it can even be a parametrized hierarchy of subsets (e.g., linkage-type clustering algorithms).
2.1 Adversary’s goal
Similarly to [6, 17, 3], the adversary’s goal can be defined according to the attack specificity, and the security violation pursued by the adversary. The attack specificity can be targeted, if it affects solely the clustering of a given subset of samples; or indiscriminate, if it potentially affects the clustering of any sample. Security violations can instead affect the integrity or the availability of a system, or the privacy of its users.
Integrity violations amount to performing some malicious activity without significantly compromising the normal system operation. In the supervised learning setting [17, 3], they are defined as attacks aiming at camouflaging some malicious samples (e.g., spam emails) to evade detection, without affecting the classification of legitimate samples. In the unsupervised setting, however, this definition cannot be generally applied, since the notion of a malicious or legitimate class is not generally available. Therefore, we regard integrity violations as attacks aiming at deflecting the grouping for specific samples, while limiting the changes to the original clustering. For instance, an attacker may obfuscate some samples to hide them in a different cluster, without excessively altering the initial clusters.
Availability violations aim to compromise the functionality of the system by causing a denial of service. In the supervised setting, this translates into causing the largest possible classification error [17, 6, 7]. According to the same rationale, in the unsupervised setting we can consider attacks that significantly affect the clustering process by worsening its result as much as possible.
Finally, privacy violations may allow the adversary to obtain information about the system’s users from the clustered data by reverse-engineering the clustering process.
2.2 Adversary’s knowledge
The adversary can have different degrees of knowledge of the attacked system. They can be defined by making specific assumptions on the four points described below.
Knowledge of the data $D$: The adversary might know the data $D$ or only a portion of it. More realistically, she may not know $D$ exactly, but she may be able to obtain a surrogate dataset sampled from the same distribution as $D$. In practice, this can be obtained by collecting samples from the same source from which samples in $D$ were collected; e.g., honeypots for malware samples [28].
Knowledge of the feature space: The adversary could know how features are extracted from each sample. Similarly to the previous case, she may know how to compute the whole feature set, or only a subset of the features.
Knowledge of the algorithm: The adversary could be aware of the targeted clustering algorithm and how it organizes the data into clusters; e.g., the criterion used to determine the cluster set from a hierarchy in hierarchical clustering.
Knowledge of the algorithm’s parameters: The attacker may even know how the parameters of the clustering algorithm (if any) have been initialized.
Perfect knowledge. The worst-case scenario, in which the attacker has full knowledge of the attacked system, is usually referred to as the perfect knowledge case [6, 7, 19, 8, 17, 3]. In our case, this amounts to knowing the data, the feature representation, the clustering algorithm, and its initialization (if any).
2.3 Adversary’s capability
The adversary’s capability defines how and to what extent the attacker can control the clustering process. In the supervised setting [17, 6], the attacker can exercise a causative or exploratory influence, depending on whether she can control training and test data, or only test data. In the case of clustering, however, there is no test phase in which some data has to be classified. Accordingly, the adversary may only exercise a causative influence by manipulating part of the data to be clustered. This is often the case, though, since this data is typically collected in an unsupervised manner. (Footnote 2: One may however think of an exploratory attack to a clustering algorithm as an attack in which the adversary aims to gain information on the clustering algorithm itself, although she may not necessarily manipulate any data to this end.)
We thus consider a scenario in which the attacker can add a maximum number $m$ of (potentially manipulated) samples to the dataset $D$. This is realistic in several practical cases, e.g., in the case of malware collected through honeypots [28], where the adversary may easily send (a few) samples without having access to the rest of the data. This amounts to controlling a (small) percentage of the input data. An additional constraint may be given in terms of a maximum amount of modifications that can be made to the attack samples. In fact, to preserve their malicious functionality, malicious samples like spam emails or malware code may not be manipulated in an unconstrained manner. Such a constraint can be encoded by a suitable distance measure between the original, non-manipulated attack samples and the manipulated ones, as in [6, 20, 17, 3].
2.4 Attack strategy
Once the adversary’s goal, knowledge and capabilities have been defined, one can determine an optimal attack strategy that specifies how to manipulate the data to meet the adversary’s goal, under the restrictions given by the adversary’s knowledge and capabilities. In formal terms, we denote by $\Theta$ the knowledge space of the adversary. Elements of $\Theta$ hold information about the dataset $D$, the clustering algorithm $f$, and its parametrization, according to the points listed in Sect. 2.2. To model the degree of knowledge of the adversary we consider a probability distribution $\mu$ over $\Theta$. The entropy of $\mu$ indicates the level of uncertainty of the attacker. For example, if we consider a perfect-knowledge scenario like the one addressed in the next section, $\mu$ is a Dirac measure peaked on an element $\theta_0 \in \Theta$ (with null entropy), where $\theta_0$ holds the information about the dataset, the algorithm and any other information listed in Sect. 2.2. Further, we assume that the adversary is given a set of attack samples $A$ that can be manipulated before being added to the original set $D$. We model with the function $\Omega(A)$ the family of sample sets that the attacker can generate according to her capability, as a function of the set of initial attack samples $A$. The set $A$ can be empty if the attack samples are not required to fulfill any constraint on their malicious functionality, i.e., they can be generated from scratch (as we will see in the case of poisoning attacks). Finally, the adversary’s goal given the knowledge $\theta \in \Theta$ is expressed in terms of an objective function $g(A'; \theta)$ that evaluates how close the modified data set integrating the (potentially manipulated) attack samples $A'$ is to the adversary’s goal. In summary, the attack strategy boils down to finding a solution to the following optimization problem:
$$\max_{A'} \; \mathbb{E}_{\theta \sim \mu}\left[ g(A'; \theta) \right] \quad \text{s.t.} \quad A' \in \Omega(A), \qquad (1)$$
where $\mathbb{E}_{\theta \sim \mu}$ denotes the expectation with respect to $\theta$ being sampled according to the distribution $\mu$.
3 Perfect knowledge attacks
In this section we provide examples of worst-case availability and integrity security violations in which the attacker has perfect knowledge of the system, as described in Sect. 2.2. We respectively refer to them as poisoning and obfuscation attacks. Since the attacker has no uncertainty about the system, we set $\mu = \delta\{\theta_0\}$, where $\delta\{\cdot\}$ is the Dirac measure and $\theta_0$ represents exact knowledge of the system. The expectation in (1) thus yields $\mathbb{E}_{\theta \sim \mu}[g(A'; \theta)] = g(A'; \theta_0)$.
3.1 Poisoning attacks
Similarly to poisoning attacks against supervised learning algorithms [7, 19], we define poisoning attacks against clustering algorithms as attacks in which the data is tainted to maximally worsen the clustering result. The adversary’s goal thus amounts to violating the system’s availability by indiscriminately altering the clustering output on any data point. To this end, the adversary may aim at maximizing a given distance measure between the clustering $C = f(D)$ obtained from the original data $D$ (in the absence of attack) and the clustering obtained by running the clustering algorithm on the contaminated data $D'$ and restricting the result to the samples in $D$, i.e., $C' = \pi_D(f(D'))$, where $\pi_D$ is a projection operator that restricts the clustering output to the data samples in $D$. We regard the tainted data $D'$ as the union of the original dataset $D$ with the attack samples in $A'$, i.e., $D' = D \cup A'$. The goal can thus be written as $g(A'; \theta_0) = d_c(C, C')$, where $d_c$ is the chosen distance measure between clusterings. For instance, if $f$ is a partitional clustering algorithm, any clustering result can be represented in terms of a matrix $Y \in \mathbb{R}^{n \times k}$, each $(i,j)$-th component being the probability that the $i$-th sample is assigned to the $j$-th cluster. Under this setting, a possible distance measure between clusterings is given by:
$$d_c(Y, Y') = \left\| Y Y^\top - Y' Y'^\top \right\|_F, \qquad (2)$$
where $\|\cdot\|_F$ is the Frobenius norm. The $(i,j)$-th component of the matrix $Y Y^\top$ represents the probability that the $i$-th and $j$-th samples belong to the same cluster. When $Y$ is binary, thus encoding hard clustering assignments, the squared distance counts the number of (ordered) pairs of samples that have been clustered together in one clustering and not in the other. In general, depending on the nature of the clustering result, other ad hoc distance measures can be adopted.
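As a minimal sketch of the distance in Eq. (2), the snippet below computes the Frobenius norm of the difference between the two co-association matrices in plain Python. The function names (`coassociation`, `clustering_distance`, `one_hot`) are illustrative and not taken from the paper; assignment matrices are assumed to be given as lists of rows.

```python
import math

def coassociation(Y):
    """Return Y Y^T for an n-by-k assignment matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in Y] for ri in Y]

def clustering_distance(Y1, Y2):
    """Frobenius norm of Y1 Y1^T - Y2 Y2^T, as in Eq. (2)."""
    A, B = coassociation(Y1), coassociation(Y2)
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B)
                         for a, b in zip(ra, rb)))

def one_hot(labels):
    """Hard assignments: turn a list of cluster labels into a binary matrix."""
    ids = sorted(set(labels))
    return [[1.0 if lab == c else 0.0 for c in ids] for lab in labels]

# Relabeling the clusters does not change the distance.
same = clustering_distance(one_hot([0, 0, 1, 1]), one_hot([1, 1, 0, 0]))
# Moving one sample changes three unordered pairs, i.e. six matrix entries.
moved = clustering_distance(one_hot([0, 0, 1, 1]), one_hot([0, 0, 0, 1]))
```

Because the distance is computed on the n-by-n co-association matrices, the two clusterings may use different numbers of clusters, which is what the dendrogram-cut criteria discussed later rely on.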
As mentioned in Sect. 2.3, we assume that the attacker can inject a maximum of $m$ data points into the original data $D$, i.e., $|A'| \le m$. This realistically limits the adversary to manipulating only a given, potentially small fraction of the dataset. The value of $m$ will be considered as a parameter in our evaluation, to investigate the robustness of the given clustering algorithm against an increasing control of the adversary over the data. We further define a box constraint on the feature values, $x_{\rm lb} \le x \le x_{\rm ub}$, to restrict the attack points to lie in some fixed interval (e.g., the smallest box that includes all the data points). Hence, we define the function encoding the adversary’s capabilities as follows:
$$\Omega_p = \left\{ \{a'_i\}_{i=1}^{m} : x_{\rm lb} \le a'_i \le x_{\rm ub}, \; i = 1, \ldots, m \right\}.$$
Note that $\Omega_p$ depends on a set of target samples $A$ in (1), but since $A$ is empty in this case, we write $\Omega_p$ instead of $\Omega_p(\emptyset)$. The reason is simply that, in the case of a poisoning attack, the attacker aims to find a set of attack samples that do not have to carry out any specific malicious activity besides worsening the clustering process.
In summary, the optimal attack strategy under the aforementioned hypotheses amounts to solving the following optimization problem derived from (1):
$$\max_{A'} \; g(A'; \theta_0) = d_c\left(C, \pi_D(f(D \cup A'))\right) \quad \text{s.t.} \quad A' \in \Omega_p. \qquad (3)$$
3.2 Obfuscation attacks
Obfuscation attacks are violations of the system integrity through targeted attacks. The adversary’s goal here is to hide a given set of initial attack samples $A$ within some existing clusters by obfuscating their content, possibly without altering the clustering results for the other samples. We denote by $C^t$ the target clustering, involving samples in $D \cup A'$, that the attacker is aiming for, $A'$ being the set of obfuscated attack samples. With the intent to preserve the clustering result on the original data samples, we impose that $\pi_D(C^t) = f(D)$, while the cluster assignments for the samples in $A'$ are freely determined by the attacker. As opposed to the poisoning attack, here the attacker is interested in pushing the final clustering towards the target clustering, and therefore her intention is to minimize the distance between $C^t$ and $f(D \cup A')$. Accordingly, the goal function in this case is defined as $g(A'; \theta_0) = -d_c(C^t, f(D \cup A'))$.
As for the adversary’s capability, we assume that the attacker can perturb the target samples in $A$ to some maximum extent. We model this by imposing that $d_s(A, A') \le d_{\max}$, where $d_s$ is a measure of divergence between the two sets of samples $A$ and $A'$, and $d_{\max}$ is a nonnegative real scalar. Consequently, the function representing the attacker’s capability is given by
$$\Omega_o(A) = \left\{ A' : d_s(A, A') \le d_{\max} \right\}.$$
The distance $d_s$ can be defined in different ways. For instance, in the next section we define $d_s$ as the largest Euclidean distance among corresponding elements in $A$ and $A'$, i.e.,
$$d_s(A, A') = \max_{i = 1, \ldots, m} \left\| a_i - a'_i \right\|_2, \qquad (4)$$
where we assume $A = \{a_i\}_{i=1}^{m}$ and $A' = \{a'_i\}_{i=1}^{m}$. This choice allows us to bound the divergence between the original target samples in $A$ and the manipulated ones, as typically done in adversarial learning [20, 17, 8, 6].
In summary, the attack strategy in the case of obfuscation attacks can be obtained as the solution of the following optimization program derived from (1):
$$\max_{A'} \; g(A'; \theta_0) = -d_c\left(C^t, f(D \cup A')\right) \quad \text{s.t.} \quad A' \in \Omega_o(A). \qquad (5)$$
4 A case study on singlelinkage hierarchical clustering
In this section we solve a particular instance of the optimization problems (3) and (5), corresponding respectively to the poisoning and obfuscation attacks described in Sects. 3.1 and 3.2, against single-linkage hierarchical clustering. The motivation behind this specific choice of clustering algorithm is that, as mentioned in Sect. 1, it has been frequently exploited in security-sensitive tasks [4, 16, 23, 24].
Single-linkage hierarchical clustering is a bottom-up algorithm that produces a hierarchy of clusterings, like any other hierarchical agglomerative clustering algorithm [18]. The hierarchy is represented by a dendrogram, i.e., a tree-like data structure showing the sequence of cluster fusions together with the distance at which each fusion took place. To obtain a given partitioning of the data into clusters, the dendrogram has to be cut at a certain height. The leaves that form a connected subgraph after the cut are considered part of the same cluster. Depending on the chosen distance between clusters (linkage criterion), different variants of hierarchical clustering can be defined. In the single-linkage variant, the distance between any two clusters is defined as the minimum Euclidean distance over all pairs of samples taken one from each cluster.
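The agglomerative procedure described above can be sketched as follows. This is a deliberately naive, unoptimized illustration in plain Python; the function name `single_linkage` and the choice of stopping at a requested number of clusters `k` (rather than cutting at a height) are assumptions made for the example.

```python
import math

def single_linkage(points, k):
    """Merge the two closest clusters until k remain.

    Returns (labels, merges), where merges lists the distance at which each
    fusion took place and the size of the resulting cluster.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: minimum distance over all cross-cluster pairs
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        merges.append((d, len(clusters[a])))
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels, merges

points = [(0, 0), (0, 1), (5, 0), (5, 1), (11, 0)]
labels, merges = single_linkage(points, k=2)
```

In this toy example the two pairs of nearby points are fused first at distance 1, then fused together at distance 5, while the isolated point remains a cluster of its own.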
For both poisoning and obfuscation attacks, we will model the clustering output as a binary matrix $Y$, indicating the sample-to-cluster assignments (see Sect. 3.1). Consequently, we can make use of the distance measure $d_c$ between clusterings defined in Eq. (2). However, to obtain a given set of clusters from the dendrogram produced by the single-linkage clustering algorithm, we will have to specify an appropriate cut criterion.
4.1 Poisoning attacks
Figure 1 (caption fragment): [...] and posterior estimates. The bridges obtained from the dendrogram are highlighted with red lines. The rightmost plot shows how the partitioning changes after the attack samples (highlighted with red circles) have been greedily added.
For poisoning attacks against single-linkage hierarchical clustering, we aim to solve the optimization problem given by Eq. (3). As already mentioned, since the clustering is expressed in terms of a hierarchy, we have to determine a suitable dendrogram cut in order to model the clustering output as a binary matrix $Y$. In this case, we assume that the clustering algorithm selects the cut, i.e., the number of clusters, that achieves the minimum distance between the clustering obtained in the absence of attack and the one induced by the cut. Although this may not be a realistic cut criterion, as the ideal clustering is not known to the clustering algorithm, this worst-case choice for the adversary gives us the minimum performance degradation incurred by the clustering algorithm under attack.
Let us now discuss how Problem (3) can be solved. First, note that it is not possible to predict analytically how the clustering output changes as the set of attack samples $A'$ is altered, since hierarchical clustering does not have a tractable, underlying analytical interpretation. (Footnote 3: In general, even if the clustering algorithm has a clearer mathematical formulation, it is not guaranteed that a good analytical prediction can be found. For instance, though k-means clustering is well understood mathematically, its variability across different initializations makes it almost impossible to reliably predict how its output may change due to data perturbation.) One possible answer consists in a stochastic exploration of the solution space (e.g., by simulated annealing). This is essentially done by perturbing the input data $A'$ a number of times, and evaluating the corresponding values of the objective function by running the clustering algorithm (as a black box) on $D \cup A'$. The set $A'$ that provides the highest objective value is eventually retained. However, to find an optimal configuration of attack samples $A'$, one should repeat this procedure a very large number of times. To reduce computational complexity, one may thus consider efficient search heuristics specifically tailored to the considered clustering algorithm.
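To illustrate the black-box exploration described above, here is a minimal sketch that uses plain random search (not simulated annealing) inside the box constraint. The `objective` callable stands in for "run the clustering algorithm on the poisoned data and measure the distance from the original clustering"; its name and signature, like `random_search` itself, are assumptions made for this example.

```python
import random

def random_search(objective, lower, upper, m, trials, seed=0):
    """Sample `trials` attack sets of m points inside the box [lower, upper]
    and keep the one with the highest objective value."""
    rng = random.Random(seed)
    best_set, best_val = None, float("-inf")
    for _ in range(trials):
        candidate = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lower, upper))
                     for _ in range(m)]
        val = objective(candidate)  # one black-box evaluation per candidate
        if val > best_val:
            best_set, best_val = candidate, val
    return best_set, best_val

# Toy stand-in objective, used only to show the calling convention.
toy_objective = lambda pts: sum(x + y for x, y in pts)
best_set, best_val = random_search(toy_objective, (0.0, 0.0), (1.0, 1.0),
                                   m=2, trials=50)
```

Each trial costs one full clustering run in the real setting, which is why the number of trials needed for a good solution quickly becomes prohibitive.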
For the above reason, we consider a greedy optimization approach where the attacker aims at finding a local maximum of the objective function by adding one attack sample at a time, i.e., $|A'| = 1$. In this case, we can more easily understand how the objective function changes as the inserted attack point varies, and define a suitable heuristic approach. An example is shown in the leftmost plot of Fig. 1. This plot shows that the objective function exhibits a global maximum when the attack point is added in between clusters that are sufficiently close to each other. The reason is that, when added in such a location, the attack point operates as a bridge, causing the two clusters to be merged into a single cluster, and the objective function to increase.
Bridge-based heuristic search. Based on this observation, we devised a search heuristic that considers only $k-1$ potential attack samples, $k$ being the actual number of clusters found by the single-linkage hierarchical clustering at a given dendrogram cut. In particular, we only considered the points lying in between the connections that have been cut to separate the given clusters from the top of the hierarchy, highlighted in our example in the leftmost plot of Fig. 1. These connections can be directly obtained from the dendrogram, i.e., we do not have to run any post-processing algorithm on the clustering result. Thus, one is only required to evaluate the objective function $k-1$ times for selecting the best attack point. We will refer to this approach as Bridge (Best) in Sect. 5.1. The rightmost plot in Fig. 1 shows the effect of our greedy attack after the attack points have been inserted. Note how the initial clusters are fragmented into smaller clusters that tend to contain points which originally belonged to different clusters.
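One way to recover the cut connections is sketched below, relying on the standard equivalence between single-linkage clustering and the minimum spanning tree of the data: cutting the dendrogram into k clusters removes the k-1 longest tree edges. The function names and the use of the edge midpoint as the candidate attack point are assumptions for illustration; the text only states that candidates lie in between the cut connections.

```python
import math

def single_linkage_bridges(points, k):
    """Return (labels, bridges) for a k-cluster single-linkage cut.

    Builds the minimum spanning tree with Kruskal's algorithm; the k-1
    longest tree edges are the connections removed by the cut.
    """
    n = len(points)
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((d, i, j))
    # Kruskal adds edges in increasing order, so the last k-1 are the cut ones.
    kept, cut = tree[:n - k], tree[n - k:]
    parent = list(range(n))  # rebuild components from the kept edges only
    for _, i, j in kept:
        parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    ids = {r: c for c, r in enumerate(dict.fromkeys(roots))}
    return [ids[r] for r in roots], [(i, j) for _, i, j in cut]

def bridge_candidates(points, bridges):
    """Midpoint of each bridge (the midpoint is an illustrative assumption)."""
    return [tuple((a + b) / 2 for a, b in zip(points[i], points[j]))
            for i, j in bridges]

points = [(0, 0), (0, 1), (5, 0), (5, 1), (11, 0)]
labels, bridges = single_linkage_bridges(points, k=3)
candidates = bridge_candidates(points, bridges)
```

With k clusters this yields exactly k-1 candidates, so selecting the best one requires k-1 evaluations of the objective, as noted above.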
Approximating $Y'$. To further reduce the computational complexity of our approach, i.e., to avoid recomputing the clustering and the corresponding value of the objective function $k-1$ times for each attack point, we consider another heuristic approach. The underlying idea is simply to select the attack sample (among the bridges suggested by our bridge-based heuristic search) that lies in between the largest clusters. In particular, we assume that the attack point will effectively merge the two adjacent clusters, and thus modify $Y'$ accordingly (without re-estimating its real value by re-running the clustering algorithm). To this end, for each point belonging to one of the two clusters, we set to 1 (0) the value of $Y'$ corresponding to the first (second) cluster. Once the estimated $Y'$ is computed, we evaluate the objective function using the estimated $Y'$, and select the attack point that maximizes its value. We will refer to this approach as Bridge (Hard) in Sect. 5.1.
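The merge-and-score step can be sketched as below, assuming hard assignments are stored as a list of cluster labels and each bridge is given as a pair of sample indices, one in each of the two clusters it would join. The helper names are hypothetical; the distance is the hard-assignment case of Eq. (2), computed directly from labels.

```python
import math

def hard_distance(labels_a, labels_b):
    """Eq. (2) for hard assignments: square root of the number of ordered
    sample pairs that are co-clustered in one labeling but not the other."""
    n = len(labels_a)
    disagreements = sum(
        (labels_a[i] == labels_a[j]) != (labels_b[i] == labels_b[j])
        for i in range(n) for j in range(n))
    return math.sqrt(disagreements)

def merged(labels, c1, c2):
    """Predicted labeling if clusters c1 and c2 were merged by a bridge."""
    return [c1 if lab == c2 else lab for lab in labels]

def best_bridge_hard(labels, bridges):
    """Score each bridge by the predicted objective, without re-clustering."""
    scores = [hard_distance(labels, merged(labels, labels[i], labels[j]))
              for i, j in bridges]
    best = max(range(len(bridges)), key=scores.__getitem__)
    return bridges[best], scores[best]

labels = [0, 0, 0, 1, 1, 2]
bridge, score = best_bridge_hard(labels, [(2, 3), (4, 5)])
```

Merging clusters of sizes 3 and 2 changes more pairs than merging clusters of sizes 2 and 1, so the first bridge is preferred, in line with the idea of targeting the largest clusters.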
Approximating $Y'$ with soft clustering assignments. Finally, we discuss a variation of the previous heuristic approach, which we will refer to as Bridge (Soft) in Sect. 5.1. The problem arises from the fact that our objective function exhibits abrupt variations, since it is computed on hard cluster assignments (i.e., binary matrices $Y$). Accordingly, adding a single attack point at a time may not reveal connections that can potentially merge large clusters after a few attack iterations, i.e., using more than one attack sample. To address this issue, we approximate $Y'$ with soft clustering assignments. To this end, the element $y_{ik}$ of $Y'$ is estimated as the posterior probability of point $x_i$ belonging to cluster $c_k$, i.e., $y_{ik} = p(c_k | x_i) = p(x_i | c_k)\, p(c_k) / p(x_i)$. The prior $p(c_k)$ is estimated as the number of samples belonging to $c_k$ divided by the total number of samples, the likelihood $p(x_i | c_k)$ is estimated with a Gaussian Kernel Density Estimator (KDE) with bandwidth parameter $h$:
$$p(x_i | c_k) = \frac{1}{|c_k|} \sum_{x_j \in c_k} \exp\left(-h \left\| x_i - x_j \right\|^2\right), \qquad (6)$$
and the evidence $p(x_i)$ is obtained by marginalization over the given set of clusters.
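A minimal sketch of these posterior estimates follows. It assumes an unnormalized Gaussian kernel of the form exp(-h * squared distance), which matches the limiting behavior described in the text for small and large bandwidth values; the exact kernel normalization and the function name `soft_assignments` are assumptions.

```python
import math

def soft_assignments(points, labels, h):
    """Posterior p(cluster | point) for every point, one row per point."""
    ids = sorted(set(labels))
    members = {c: [p for p, lab in zip(points, labels) if lab == c]
               for c in ids}
    n = len(points)
    Y = []
    for x in points:
        joint = []
        for c in ids:
            prior = len(members[c]) / n
            # Gaussian KDE likelihood, averaged over the cluster's samples
            likelihood = sum(math.exp(-h * math.dist(x, xj) ** 2)
                             for xj in members[c]) / len(members[c])
            joint.append(prior * likelihood)
        evidence = sum(joint)  # marginalization over the clusters
        Y.append([v / evidence for v in joint])
    return Y

points = [(0, 0), (0, 1), (5, 0), (5, 1)]
labels = [0, 0, 1, 1]
Y_sharp = soft_assignments(points, labels, h=1.0)
Y_flat = soft_assignments(points, labels, h=1e-6)
```

With `h=1.0` the rows are close to the original hard assignments, whereas with a very small `h` every row approaches the cluster priors (here 0.5 each).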
It is worth noting that, for too small values of $h$, the posterior estimates tend to the same value, i.e., each point is likely to be assigned to any cluster with the same probability. When $h$ is too high, instead, each point is assigned to one cluster, and the objective function thus equals that corresponding to the original hard assignments. In our experiments we simply avoid these limit cases by selecting a value of $h$ comparable to the average distance between all possible pairs of samples in the dataset, which gave reasonable results.
An example of the smoother approximation of the objective function provided by this heuristic is shown in the middle plot of Fig. 1. Besides, this technique also provides a reliable approximation of the true objective: although its values are significantly rescaled, the global maximum is still found in the same location. The smooth variations that characterize the approximated objective influence the choice of the best candidate attack point. In fact, attack points lying on bridges that may potentially connect larger clusters after some attack iterations may be sometimes preferred to attack points that can directly connect smaller and closer clusters. This may lead to a larger increase in the true objective function as the number of injected attack points increases.
4.2 Obfuscation attacks
In this section we solve (5) assuming the worst-case (perfect-knowledge) scenario against the single-linkage clustering algorithm. Recall that the attacker’s goal in this case is to manipulate a given set of non-obfuscated samples $A$ such that they are clustered according to a desired configuration, e.g., together with points in an existing, given cluster, without significantly altering the initial clustering that would be obtained in the absence of manipulated attacks.
As in the previous case, to represent the output of the clustering algorithm as a binary matrix $Y$ of clustering assignments, and thus compute $d_c$ as given by Eq. (2), we have to define a suitable criterion for cutting the dendrogram. Similarly to poisoning attacks, we define an advantageous criterion for the clustering algorithm, which gives us the lowest performance degradation incurred under this attack: we select the dendrogram cut that minimizes $d_c(C^{\star}, f(D \cup A'))$, where $C^{\star}$ represents the optimal clustering that would be obtained including the non-manipulated attack samples, i.e., $C^{\star} = f(D \cup A)$. The reason is that, to better counter an obfuscation attack, the clustering algorithm should try to keep the attack points corresponding to the non-manipulated set $A$ in their original clusters. For instance, in the case of malware clustering, non-obfuscated malware may easily end up in a well-defined cluster, and thus it may be subsequently categorized in a well-behaved malware family. While the adversary tries to manipulate malware to have it clustered differently, the best solution for the clustering algorithm would be to obtain the same clusters that would be obtained in the absence of attack manipulation.
We derive a simple heuristic to get an approximate solution of (5), assuming $d_s$ to be defined as in (4). We assume that, for each sample $a_i \in A$, the attacker selects the closest sample $d_i \in D$ belonging to the cluster to which $a_i$ should belong according to the attacker’s desired clustering $C^t$. To meet the constraint given by $\Omega_o$ in Eq. (5), the attacker then determines for each $a_i$ a new sample $a'_i$ along the line connecting $a_i$ and $d_i$, in a way not to exceed the maximum distance $d_{\max}$ from $a_i$, i.e., $a'_i = a_i + \alpha (d_i - a_i)$, where $\alpha = \min(1, d_{\max} / \|a_i - d_i\|_2)$.
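The heuristic above amounts to a clipped step towards the nearest sample of the desired cluster. A minimal sketch, assuming the target cluster is passed as a list of points and using the hypothetical function name `obfuscate`:

```python
import math

def obfuscate(a, target_cluster, d_max):
    """Move attack sample `a` towards its closest point in the target
    cluster, travelling at most d_max in Euclidean distance."""
    closest = min(target_cluster, key=lambda x: math.dist(a, x))
    dist = math.dist(a, closest)
    # Step fraction: go all the way if the closest point is within reach.
    alpha = 1.0 if dist <= d_max else d_max / dist
    return tuple(ai + alpha * (ci - ai) for ai, ci in zip(a, closest))

target = [(3.0, 4.0), (10.0, 0.0)]
partial = obfuscate((0.0, 0.0), target, d_max=2.5)   # halfway to (3, 4)
full = obfuscate((0.0, 0.0), target, d_max=10.0)     # reaches (3, 4)
```

Applying this step independently to every attack sample keeps the largest per-sample displacement within the budget, so the resulting set satisfies the constraint of Eq. (4) by construction.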
5 Experiments
We present here some experiments to evaluate the effectiveness of the poisoning and obfuscation attacks devised in Sect. 4 against the single-linkage hierarchical clustering algorithm, under perfect knowledge of the attacked system.
5.1 Experiments on poisoning attacks
For the poisoning attack, we consider three distinct cases: a two-dimensional artificial data set, a realistic application example on malware clustering, and a task in which we aim to cluster together distinct handwritten digits.
| Method | Banana (20%) Split | Banana (20%) Merge | Malware (5%) Split | Malware (5%) Merge | Digits (1%) Split | Digits (1%) Merge |
|---|---|---|---|---|---|---|
| Random | 1.15 ± 0.22 | 1.29 ± 0.06 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
| Random (Best) | 1.40 ± 0.34 | 1.54 ± 0.30 | 1.00 ± 0.00 | 1.34 ± 0.39 | 1.00 ± 0.00 | 1.00 ± 0.00 |
| Bridge (Best) | 2.40 ± 0.60 | 1.40 ± 0.23 | 1.49 ± 0.23 | 1.31 ± 0.17 | 33.9 ± 0.15 | 1.02 ± 0.00 |
| Bridge (Soft) | 3.85 ± 1.35 | 1.22 ± 0.11 | 2.76 ± 0.84 | 1.12 ± 0.09 | 33.9 ± 0.15 | 1.02 ± 0.00 |
| Bridge (Hard) | 3.75 ± 1.43 | 1.21 ± 0.23 | 2.41 ± 0.73 | 1.10 ± 0.10 | 34.0 ± 0.00 | 1.02 ± 0.00 |

Split and Merge averaged values and standard deviations (mean ± std) for the Banana-shaped dataset (at 20% poisoning), the Malware dataset (at 5% poisoning), and the Digits dataset (at 1% poisoning).
5.1.1 Artificial data
We consider here the standard two-dimensional banana-shaped dataset from PRTools (http://prtools.org), for which a particular instance is shown in Fig. 1 (right and middle plot). We fix the number of initial clusters to $k = 4$, which yields our original clustering in the absence of attack.
We repeat the experiment five times, each time by randomly sampling 80 data points. In each run, we add up to $m$ attack samples, which simulates a scenario in which the adversary can control up to 20% of the data. As described in Sect. 4.1, the attack proceeds greedily by adding one sample at a time. After adding each attack sample, we allow the clustering algorithm to change the number of clusters within a fixed range. The criterion used to determine the number of clusters is to minimize the distance of the current partitioning from the clustering in the absence of attack, as explained in detail in Sect. 4.1.
We consider five attack strategies, described in the following.
Random: the attack point is selected at random in the minimum box that encloses the data.
Random (Best): a number of attack points equal to the actual number of clusters at the given attack iteration is selected at random. Then, the objective function is evaluated for each point, and the best one is chosen.
Bridge (Best): The bridges suggested by our heuristic approach are evaluated, and the best one is chosen.
Bridge (Hard): The bridges are evaluated here by predicting the clustering output as discussed in Sect. 4.1 (i.e., assuming that the corresponding clusters will be merged), using hard clustering assignments.
Bridge (Soft): This is the same strategy as Bridge (Hard), except for the fact that we consider soft clustering assignments when predicting the clustering output. To this end, as discussed in Sect. 4.1, we use a Gaussian KDE. We set the kernel bandwidth as the average distance between each possible pair of samples in the data.
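The bandwidth rule used by Bridge (Soft) can be sketched as follows. The bandwidth computation follows the description above; the way the kernel densities are turned into soft assignments (per-cluster average kernel value, normalized to sum to one) is only our assumption for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations


def average_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of samples."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return sum(dists) / len(dists)


def soft_assignments(x, clusters, bandwidth):
    """Normalized Gaussian-kernel density of x under each cluster's samples."""
    densities = []
    for members in clusters:
        kernel_values = [
            math.exp(-math.dist(x, m) ** 2 / (2 * bandwidth ** 2)) for m in members
        ]
        densities.append(sum(kernel_values) / len(members))
    total = sum(densities)
    return [d / total for d in densities]


clusters = [[(0.0, 0.0), (0.0, 1.0)], [(4.0, 0.0), (4.0, 1.0)]]
all_points = [p for members in clusters for p in members]
h = average_pairwise_distance(all_points)            # kernel bandwidth
weights = soft_assignments((1.0, 0.5), clusters, h)  # one weight per cluster
```

A point lying closer to the first cluster receives a larger weight for it, but the second cluster still gets a non-zero share, which is what distinguishes soft from hard assignments.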
It is worth remarking that Random (Best) and Bridge (Best) require the objective function to be evaluated once per candidate attack sample at each iteration to select the best one. This means that the clustering algorithm has to be rerun for every candidate at each step. Instead, the other methods do not require us to rerun the clustering algorithm to select the attack point. Their complexity is therefore significantly lower than that of the aforementioned methods.
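The two random baselines can be sketched in a few lines of Python. We assume an axis-aligned bounding box and uniform sampling, and we take "best" to mean the largest objective value; the objective passed in below is a toy placeholder, whereas the actual one would require rerunning the clustering algorithm on each candidate.

```python
import random


def bounding_box(points):
    """Per-feature (min, max) of the smallest axis-aligned box enclosing the data."""
    return [(min(col), max(col)) for col in zip(*points)]


def random_attack_point(points, rng):
    """Random strategy: one point drawn uniformly inside the bounding box."""
    return tuple(rng.uniform(lo, hi) for lo, hi in bounding_box(points))


def random_best_attack_point(points, n_candidates, objective, rng):
    """Random (Best): draw several candidates, keep the one with the largest objective."""
    candidates = [random_attack_point(points, rng) for _ in range(n_candidates)]
    return max(candidates, key=objective)


data = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
rng = random.Random(0)


def toy_objective(p):
    # Placeholder only: the real objective reruns the clustering on data + p.
    return -abs(p[0] - 1.0)


point = random_best_attack_point(data, n_candidates=4, objective=toy_objective, rng=rng)
```

The cost difference discussed above shows up directly: `random_best_attack_point` calls the objective once per candidate, while `random_attack_point` never calls it.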
The results averaged over the five runs are reported in Fig. 2 (first column). From the top plot one may appreciate how the methods based on the bridge-based heuristics achieve similar values of the objective function, while clearly outperforming the random-based methods. Further, as reasonably expected, Random (Best) outperforms Random since it selects the best point over several attempts. Nevertheless, even selecting a random attack sample, in this case, turned out to significantly affect the clustering results.
The bottom plot provides a better understanding of how the attack effectively works. The main effect is indeed to fragment the original clusters into a high number of smaller clusters. In particular, after the insertion of all attack points, i.e., when 20% of the data is controlled by the attacker, the selected number of clusters increases from 4 to about 7–14 clusters depending on the considered method.
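To further clarify the effect of the attack on the clustering algorithm, we consider two measures referred to as Split and Merge in Table 1, which are given as follows. Let $\mathcal{C}$ and $\mathcal{C}'$ be the initial and the final clustering restricted to elements in the original (non-attack) data $\mathcal{D}$, respectively, and let $\mathbf{A}$ be a binary matrix, each entry $a_{k,k'}$ indicating the co-occurrence of at least one sample in the $k$-th cluster of $\mathcal{C}$ and in the $k'$-th cluster of $\mathcal{C}'$. Then, the above measures are given as:
$$\mathrm{split} = \mathrm{mean}_{k} \sum_{k'} a_{k,k'}, \qquad \mathrm{merge} = \mathrm{mean}_{k'} \sum_{k} a_{k,k'}.$$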
Intuitively, split quantifies to what extent the initial clusters are fragmented across different final clusters, while merge quantifies to what extent the final clusters contain samples that originally belonged to different initial clusters.
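A short Python sketch may make the two measures concrete. It assumes that Split is the mean row sum and Merge the mean column sum of the binary co-occurrence matrix, a reading consistent with the intuition given above; the function name is hypothetical.

```python
def split_and_merge(initial_labels, final_labels):
    """Return (split, merge) for two labelings of the same original samples."""
    initial_ids = sorted(set(initial_labels))
    final_ids = sorted(set(final_labels))
    cooccurring = set(zip(initial_labels, final_labels))
    # a[k][k2] = 1 if some sample lies in initial cluster k and final cluster k2.
    a = [[int((k, k2) in cooccurring) for k2 in final_ids] for k in initial_ids]
    split = sum(sum(row) for row in a) / len(initial_ids)
    merge = sum(sum(col) for col in zip(*a)) / len(final_ids)
    return split, merge


initial = [0, 0, 0, 0, 1, 1]    # two initial clusters
final = [0, 0, 1, 2, 2, 2]      # three final clusters (attack samples excluded)
split, merge = split_and_merge(initial, final)
```

Here the first initial cluster is spread over three final clusters and the second over one, so Split is 2; the last final cluster mixes both initial clusters, giving a Merge of 4/3. When the clustering is unchanged, both measures equal 1, as in the Random rows of Table 1 for the Malware and Digits data.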
From Table 1, it can be appreciated how, for the most effective attacks, i.e., Bridge (Soft) and Bridge (Hard), the initial clusters are split into approximately 3.8 clusters, while the final clusters merge approximately 1.2 initial clusters, on average. This clarifies how the proposed attack eventually compromises the initial clustering: it tends to fragment the initial clusters into smaller ones, and to merge together final clusters which originally came from different clusters. Bridge (Best) tends instead to induce a lower number of final clusters, i.e., the clustering algorithm tends to merge more final clusters than splitting initial ones. However, this is not the optimal choice according to the attacker’s goal.
5.1.2 Malware clustering
We consider here a more realistic application example involving malware clustering, and in particular a simplified version of the algorithm for behavioral malware clustering proposed in [23]. The ultimate goal of this approach is to obtain malware clusters that can aid the automatic generation of high-quality network signatures, which can be used in turn to detect botnet command-and-control (C&C) and other malware-generated communications at the network perimeter. With respect to the original algorithm, we made the following simplifications:


We consider only the first of the two clustering steps carried out by the original system. The algorithm proposed in [23] clusters samples through two consecutive stages, named coarse-grain and fine-grain clustering, respectively. Here, we just focus on the coarse-grain clustering, which is based on a set of numeric features.

We consider a subset of six statistical features (out of the seven used by the original algorithm). They are: number of GET requests; number of POST requests; average length of the URLs; average number of parameters in the request; average amount of data sent by POST requests; and average length of the response. We exclude the seventh feature, i.e., the total number of HTTP requests, as it is redundant with respect to the first and the second feature. All feature values are rescaled as in the original work.
For the purpose of this evaluation, we use a subset of 1,000 samples taken from Dataset 1 of [23]. This dataset consists of distinct malware samples (no duplicates) collected during March 2010 from a number of different malware sources, including MWCollect [1], Malfease [2], and commercial malware feeds. As in the previous setting, we repeat the experiments five times, randomly selecting a subset of 475 samples from the available set of malware data in each run. The initial set of clusters, as in [23], is selected as the partitioning that minimizes the value of the Davies-Bouldin Index (DBI) [15], a measure that characterizes dispersion and closeness of clusters. We consider the cuts of the initial dendrogram that yield from 2 to 25 clusters, and choose the one corresponding to the minimum DBI. This yields approximately 3 clusters in each run. While the attack proceeds, the clustering algorithm can choose the number of clusters within a prescribed range. The attacker can inject up to 25 attack samples, which amounts to controlling up to 5% of the data. The kernel bandwidth of the KDE used in Bridge (Soft) is set as the average distance between pairs of samples in each run.
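The DBI-based choice of the initial clustering can be illustrated with the sketch below. It uses the common form of the index (average distance to the centroid as cluster scatter, Euclidean distance between centroids as separation); whether [23] uses exactly this variant is an assumption on our part, and `best_cut` is a hypothetical helper.

```python
import math


def centroid(points):
    return tuple(sum(col) / len(points) for col in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of point lists."""
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
        for i in range(k)
    ]
    return sum(worst_ratios) / k


def best_cut(cuts):
    """cuts maps a number of clusters to the partition given by that dendrogram cut."""
    return min(cuts, key=lambda k: davies_bouldin(cuts[k]))


group_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
group_b = [(10.0, 0.0), (11.0, 0.0), (10.0, 1.0), (11.0, 1.0)]
cuts = {
    2: [group_a, group_b],
    3: [[(0.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (1.0, 1.0)], group_b],
}
chosen = best_cut(cuts)
```

Splitting the compact left group creates two clusters that are close to each other relative to their scatter, which raises the index, so the two-cluster cut is preferred.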
Results are shown in Fig. 2 (second column). The effect of the attack is essentially the same as in the previous experiments on the Banana-shaped data, although here there is a significant difference among the performances of the bridge-based methods. In particular, Bridge (Soft) gradually outperforms the other approaches as the fraction of injected samples approaches 5%. The reason is that, as qualitatively discussed in Sect. 4.1, this heuristic approach tends to bridge clusters which are too far apart to be bridged with a single attack point, and are thus disregarded by Bridge (Best) and not always chosen by Bridge (Hard). It is also worth noting that, in this case, the Random approach is totally ineffective. In particular, no change in the objective function is observed for this method, and the number of clusters increases linearly as the attack proceeds. This simply means that the clustering algorithm produces a new cluster for each newly-injected attack point, making the attack totally ineffective. The behavior exhibited by the different attack strategies is also confirmed by the Split and Merge values reported in Table 1. Here, the most effective methods, i.e., again Bridge (Soft) and Bridge (Hard), split the 3 initial clusters each into 2.7 and 2.4 final clusters, on average, yielding a total number of clusters of about 20–25. Similarly to the previous experiments, Bridge (Best) yields a lower number of final clusters, as it more strongly induces the clustering algorithm to cluster together samples that originally belonged to different initial clusters.
5.1.3 Handwritten digits
We finally repeat the experiments described in the previous sections on the MNIST handwritten digit data [21] (this dataset is publicly available in Matlab format at http://cs.nyu.edu/~roweis/data.html). In this dataset, each digit is size-normalized and centered, and represented as a grayscale image of 28 × 28 pixels. Each pixel is raster-scan ordered and its value is directly considered as a feature. The dimensionality of the feature space is thus 784, a much higher value than that considered in the previous cases. We further normalize each feature (pixel) in [0, 1] by dividing its value by 255.
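As an illustration of this feature representation, the following sketch flattens an image row by row and rescales its pixels. It assumes the standard MNIST format (28 × 28 images with 8-bit intensities, hence 784 features and a divisor of 255); `image_to_features` is a hypothetical name.

```python
def image_to_features(image, max_intensity=255):
    """Flatten a 2-D grayscale image row by row and rescale pixels to [0, 1]."""
    return [pixel / max_intensity for row in image for pixel in row]


# Tiny 2 x 3 stand-in image; a real MNIST digit would be 28 x 28 (784 features).
tiny = [[0, 128, 255],
        [255, 0, 51]]
features = image_to_features(tiny)

blank_digit = [[0] * 28 for _ in range(28)]
n_features = len(image_to_features(blank_digit))
```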
We focus here on a subset of data consisting of the three digits ‘0’, ‘1’, and ‘6’. To obtain three initial clusters, each representing one of the considered digits, we first compute the average digit for each class (i.e., the average ‘0’, ‘1’, and ‘6’), and then select 700 samples per class, by retaining the closest samples to the corresponding average digit. We repeat the experiments five times, each time randomly selecting 330 samples per digit from the corresponding set of preselected samples. While the attack proceeds, the clustering algorithm can choose a number of clusters ranging from a prescribed minimum up to 100. We assume that the attacker can inject up to 10 attack samples, which amounts to controlling up to 1% of the data. The kernel bandwidth of the KDE used in Bridge (Soft) is set as in the previous case, based on the average distance between all pairs of samples.
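The preselection step can be sketched as follows, assuming Euclidean distance to the class average (the distance used is not stated explicitly here); `closest_to_mean` is a hypothetical helper.

```python
import math


def closest_to_mean(samples, n_keep):
    """Keep the n_keep samples nearest (Euclidean) to the class average."""
    mean = [sum(col) / len(samples) for col in zip(*samples)]
    return sorted(samples, key=lambda s: math.dist(s, mean))[:n_keep]


# Toy two-feature "class"; the last sample is an outlier.
zeros = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1), (5.0, 5.0)]
kept = closest_to_mean(zeros, n_keep=3)
```

Retaining only the samples nearest to the average digit discards atypical examples, which makes it more likely that each digit initially forms a single cluster.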
Results are shown in Fig. 2 (third column). With respect to the previous experiments on the Banana-shaped data, and on the Malware data, the results here are significantly different. In particular, note how the Random and Random (Best) approaches are totally ineffective here. Similarly to the previous case in malware clustering, the clustering algorithm essentially defeats the attack influence by creating a new cluster for each attack sample. The underlying reason is that, in this case, the feature space has a very high dimensionality, and, thus, sampling only a few points at random is not enough to find a suitable attack point. In other words, if an attack sample is not very well crafted, it may be easily isolated from the rest of the data. Although increasing the dimensionality may thus seem a suitable countermeasure to protect clustering against random attacks, this drastically increases its vulnerability to well-designed attack samples. Note indeed how the clustering is already significantly worsened when the adversary only controls a fraction as small as 0.2% of the data. In fact, the number of final clusters rises immediately to the maximum allowed number of 100. This is also clarified in Table 1, where it can be appreciated how the initial clusters are fragmented into an average of 33 final clusters for the bridge-based methods. Note however that, in this case, the final clusters are almost pure, i.e., the attack algorithm does not succeed in merging together samples coming from different initial clusters.
In Fig. 3 we also show some of the attack samples that are produced by the five attack strategies, at different attack iterations. The random-based attacks clearly produce very noisy images which yield a completely ineffective attack, as already mentioned. Instead, the initial attacks considered by bridge-based methods (at iteration 1 and 2) resemble effectively the digits corresponding to the two initial clusters that they aim to connect (‘0’ and ‘6’, and ‘1’ and ‘6’). Since the attack completely destroys the three initial clusters after very few attack samples have been added, at later iterations (e.g., iteration 10), the bridge-based methods tend to enforce some connection within the cluster belonging to the ‘0’ digit, probably trying to merge some of the final clusters together. However, since the maximum number of allowed clusters has been already reached, no further improvement is observed in the objective function.
5.2 Experiments on obfuscation attacks
For the obfuscation attack, we present an experiment on handwritten digits, using again the MNIST digit data described in Sect. 5.1.3.
5.2.1 Handwritten digits
We consider the same initial clusters of Sect. 5.1.3, consisting of 330 samples for each of the following digits: ‘0’, ‘1’, and ‘6’. As in the previous case, we average the results over five runs, each time selecting the initial 330 samples per cluster from the preselected sets of 700 samples per digit. In this case, however, we consider a further initial cluster of 100 samples corresponding to the digit ‘3’ (which are also randomly sampled from a preselected set of 700 samples of ‘3’, chosen with the same criterion used in Sect. 5.1.3 to end up in the same cluster, initially). These represent the attack samples that the attacker aims to obfuscate. We remind the reader that the attacker’s goal in this case is to manipulate some samples to have them clustered according to a desired criterion, without significantly affecting the normal system operation. In particular, we assume here that the attacker can manipulate samples corresponding to the digit ‘3’ in order to have them clustered together with the cluster corresponding to the digit ‘6’, while preserving the initial clusters. In other words, the desired clustering output for the attacker consists of three clusters: one corresponding to the ‘0’ digit, one corresponding to the ‘1’ digit, and the last one corresponding to the digits ‘6’ and ‘3’. These constraints can be easily encoded as a desired clustering output through a binary matrix. This reflects exactly Problem 5, where the attacker aims at minimizing the distance between the clustering obtained under attack and this desired output.
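The encoding of the attacker's desired output can be sketched as a one-hot assignment matrix. We assume a samples-by-clusters layout, which may differ from the convention of Sect. 3; the names below are hypothetical.

```python
def assignment_matrix(labels, n_clusters):
    """One-hot matrix: entry [i][j] is 1 if sample i belongs to cluster j."""
    return [[int(label == j) for j in range(n_clusters)] for label in labels]


# Desired output: '0' -> cluster 0, '1' -> cluster 1, '6' and '3' -> cluster 2.
desired_cluster = {"0": 0, "1": 1, "6": 2, "3": 2}
digits = ["0", "1", "6", "3", "3"]
Y_desired = assignment_matrix([desired_cluster[d] for d in digits], n_clusters=3)
```

Each row contains exactly one non-zero entry, and the rows for ‘6’ and ‘3’ coincide, which expresses the requirement that the two digits share a cluster.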
On the other hand, as explained in Sect. 3.2, the clustering algorithm attempts to keep the attack points corresponding to the digit ‘3’ in a cluster that is well separated from the remaining digits, i.e., it selects the number of clusters that minimizes the distance from a reference clustering, which can thus be regarded as the objective function for the clustering algorithm. In this case, the reference is the clustering obtained on the initial data and the non-manipulated attack samples.
The results for the above discussed obfuscation attack are given in Fig. 4, where we report the values of the objective function for the attacker and for the clustering algorithm, as well as the number of selected clusters, as a function of the maximum amount of allowed modifications to the attack samples, given in terms of the maximum Euclidean distance (see Eq. 4). The results clearly show that the objective function of the attacker tends to decrease, while that of the clustering algorithm generally increases. The reason is that, initially, the clustering algorithm correctly separates the four clusters associated to the four distinct digits, whereas as the maximum distance increases, the attack digits ‘3’ are more and more altered to resemble the closest ‘6’s, and are then gradually merged to their cluster. The number of clusters does not decrease immediately to 3 as one would expect since, while manipulating the attack samples, their cluster is fragmented into smaller ones (typically, two or three clusters). The reason is that, to remain as close as possible to the ideal clustering, the clustering algorithm prevents some of the ‘3’s from immediately joining the cluster of ‘6’s by fragmenting the cluster of ‘3’s.
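The role of the maximum Euclidean distance can be illustrated with a simple sketch in which each attack sample is moved straight towards its closest target sample, without exceeding the distance budget. This straight-line rule is only a stand-in for the attack of Sect. 4; the parameter name `d_max` and both function names are hypothetical.

```python
import math


def move_towards(sample, target, d_max):
    """Shift sample towards target by at most d_max in Euclidean distance."""
    gap = math.dist(sample, target)
    if gap <= d_max:
        return tuple(target)
    step = d_max / gap
    return tuple(s + step * (t - s) for s, t in zip(sample, target))


def obfuscate(sample, targets, d_max):
    """Move sample towards its closest target under the distance budget."""
    closest = min(targets, key=lambda t: math.dist(sample, t))
    return move_towards(sample, closest, d_max)


three = (0.0, 0.0)                      # stand-in for a digit '3'
sixes = [(3.0, 4.0), (10.0, 0.0)]       # stand-ins for digits '6'
moved = obfuscate(three, sixes, d_max=2.5)
```

With a budget of 2.5 the sample covers half of the distance to its closest target; with a budget larger than the gap it would coincide with the target.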
For an intermediate range of the maximum distance, ending at about 4, the clustering algorithm creates only three clusters, corresponding effectively to the attacker’s goal (this is witnessed by the fact that the averaged attacker’s objective is almost zero). Surprisingly, though, as soon as the maximum distance becomes greater than 4, the number of clusters rises again to 4, and some of the attack samples are again separated from the cluster of ‘6’s, worsening the adversary’s objective. This is due to the fact that, for the intermediate values, some of the attack points work as bridges and successfully connect the remaining ‘3’s to the cluster of ‘6’s, whereas when these points are further shifted towards the cluster of ‘6’s, the algorithm can successfully split the two clusters again. Based on this observation, a smarter attacker may even manipulate only a very small subset of her attack samples to create proper bridges and connect the remaining non-manipulated samples to the desired cluster. We leave a quantitative investigation of this approach to future work.
In Fig. 5 we finally report an example of how a digit ‘3’ is manipulated by our attack to be hidden in the cluster associated to the digit ‘6’. It is worth noting how, when , the original attack sample still significantly resembles the initial ‘3’: this shows that the adversary’s goal can be achieved without altering too much the initial attack samples, which is clearly a strong desideratum for the attacker in adversarial settings.
6 Conclusions and future work
In this paper, we addressed the problem of evaluating the security of clustering algorithms in adversarial settings, by providing a framework for simulating potential attack scenarios. We devised two attacks that can significantly compromise availability and integrity of the targeted system. We demonstrated with real-world experiments that single-linkage clustering may be significantly vulnerable to deliberate attacks, either when the adversary can only control a very small fraction of the input data, or when she slightly manipulates her attack samples. This shows that attacking clustering algorithms with tailored strategies can significantly alter their output to meet the adversary’s goal.
Admittedly, one of the causes of the vulnerability of single-linkage resides in its inter-cluster distance, which solely depends on the closest points between clusters, and thus allows for an efficient construction of bridges. It is reasonable to assume that algorithms based on computing averages (e.g., k-means) or density estimation might be more robust to poisoning, although not necessarily robust to obfuscation attacks. However, the results of our empirical evaluation cannot be directly generalized to different algorithms, and more investigation should thus be carried out in this respect.
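The dependence of the single-linkage distance on the closest pair of points can be made concrete with a small sketch. Placing the bridge at the midpoint of the closest pair is our simplification for illustration and not necessarily the heuristic of Sect. 4.1; the function names are hypothetical.

```python
import math


def single_linkage_distance(cluster_a, cluster_b):
    """Inter-cluster distance of single linkage: the closest cross-cluster pair."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def closest_pair(cluster_a, cluster_b):
    """The pair of points (one per cluster) realizing the single-linkage distance."""
    return min(
        ((p, q) for p in cluster_a for q in cluster_b),
        key=lambda pair: math.dist(*pair),
    )


left = [(0.0, 0.0), (1.0, 0.0)]
right = [(5.0, 0.0), (6.0, 0.0)]
p, q = closest_pair(left, right)
bridge = tuple((a + b) / 2 for a, b in zip(p, q))   # midpoint of the closest pair

gap_before = single_linkage_distance(left, right)
gap_after = single_linkage_distance(left + [bridge], right)
```

A single injected point halves the inter-cluster distance in this example, whereas a distance based on cluster averages would change far less.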
In general, finding the optimal attack strategy given an arbitrary clustering algorithm is a hard problem and we have to rely on heuristic algorithms in order to carry out our analysis. For the sake of efficiency, these heuristics should be heavily dependent on the targeted clustering algorithm, as in our case. However, it would be interesting to exploit more general approaches that ideally treat the clustering algorithm as a black box and find a solution by performing a stochastic search on the solution space (e.g., by simulated annealing), or an educated exhaustive search (e.g., by using branch-and-bound techniques).
In this work we did not address the problem of countering attacks by designing more secure clustering algorithms. We only assumed that the clustering algorithm can select a different number of clusters (optimal according to its goal) after each attack iteration. More generally, though, one can design a clustering algorithm that explicitly takes into account the adversary’s presence, and her optimal attack strategy, e.g., by modeling clustering in adversarial settings as a game between the clustering algorithm and the attacker. This has been done in the case of supervised learning, to improve the security of learning algorithms against evasion attempts [8], and similarly, in the regression setting [13]. Other approaches may more directly encode explicit assumptions on how the data distribution changes under attack, similarly to [5]. We leave this investigation to future work.
Another possible future extension of our work would be to consider a more realistic setting in which the attacker has limited knowledge of the attacked system. To this end, the upper bound on the performance degradation incurred under attack provided by our worst-case analysis may be exploited to evaluate the effectiveness of attacks devised under limited knowledge (i.e., how close they can get to the worst case).
One limitation of our approach may be the so-called inverse feature-mapping problem [17, 6], i.e., the problem of finding a real attack sample corresponding to a desired feature vector (as the ones suggested by our attack strategies). In the reported experiments, this was not a significant problem since modifications to the given feature values could be directly mapped to manipulations on the real attack samples. Although inverting the feature mapping may be a cumbersome task for more complicated feature representations, this remains a common problem of optimal attacks in adversarial learning, and it has to be addressed in an application-specific manner, depending on the given feature space.
As a further future development, we plan to establish a link between the evaluation of the security of clustering algorithms and the problem of determining the stability of a clustering, which has been already addressed in the literature and used as a device for model selection (see, e.g., [29]). Indeed, stable clusterings can be regarded as secure under specific non-targeted attacks like, e.g., perturbation of points with Gaussian noise.
Understanding robustness of clustering algorithms against carefully targeted attacks under a more theoretical perspective (e.g., by devising theoretical bounds that evaluate the impact of single attack points on the clustering output) may also be a promising research direction. Some results from clustering stability may be also exploited to this end.
7 Acknowledgments
This work has been partly supported by the Regional Administration of Sardinia (RAS), Italy, within the projects “Security of pattern recognition systems in future internet” (CRP18293), and “Advanced and secure sharing of multimedia data over social networks in the future Internet” (CRP17555). Both projects are funded within the framework of the regional law L.R. 7/2007, Bando 2009. The opinions, findings and conclusions expressed in this paper are solely those of the authors and do not necessarily reflect the opinions of any sponsor.
References
 [1] Collaborative Malware Collection and Sensing. https://alliance.mwcollect.org.
 [2] Project Malfease. http://malfease.oarci.net.
 [3] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? In ASIACCS ’06: Proc. 2006 ACM Symposium on Information, Computer and Communications Security, pages 16–25, NY, USA, 2006. ACM.
 [4] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Krügel, and E. Kirda. Scalable, behavior-based malware clustering. In NDSS. The Internet Society, 2009.
 [5] B. Biggio, G. Fumera, and F. Roli. Design of robust classifiers for adversarial environments. In IEEE Int’l Conf. Sys., Man, and Cyber., pages 977–982, 2011.
 [6] B. Biggio, G. Fumera, and F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowledge and Data Eng., 99(PrePrints):1, 2013.

 [7] B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support vector machines. In J. Langford and J. Pineau, editors, 29th Int’l Conf. on Machine Learning. Omnipress, 2012.
 [8] M. Brückner, C. Kanzow, and T. Scheffer. Static prediction games for adversarial learning problems. J. Mach. Learn. Res., 13:2617–2654, 2012.
 [9] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani. Crowdroid: behavior-based malware detection system for Android. In Proc. 1st ACM workshop on Security and Privacy in Smartphones and Mobile devices, SPSM ’11, pages 15–26, NY, USA, 2011. ACM.
 [10] C. Castillo and B. D. Davison. Adversarial web search. Foundations and Trends in Information Retrieval, 4(5):377–486, May 2011.
 [11] J. G. Dutrisac and D. Skillicorn. Hiding clusters in adversarial settings. In IEEE Int’l Conf. on Intell. Security Informatics (ISI), pages 185–187, 2008.
 [12] D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice Hall, 2011.
 [13] M. Großhans, C. Sawade, M. Brückner, and T. Scheffer. Bayesian games for adversarial regression problems. In J. Mach. Learn. Res.  Proc. 30th Int’l Conf. on Machine Learning (ICML), volume 28, 2013.
 [14] P. Haider, L. Chiarandini, and U. Brefeld. Discriminative clustering for market segmentation. In 18th Int’l Conf. Knowl. Disc. Data Mining, KDD ’12, pages 417–425, NY, USA, 2012. ACM.
 [15] M. Halkidi, Y. Batistakis, and M. Vazirgiannis. On clustering validation techniques. Journal of Intelligent Information Systems, 17(2-3):107–145, Dec. 2001.
 [16] S. Hanna, L. Huang, E. Wu, S. Li, C. Chen, and D. Song. Juxtapp: a scalable system for detecting code reuse among Android applications. In Proc. 9th Int’l Conf. on Detection of Intrusions and Malware, and Vulnerability Assessment, DIMVA’12, pages 62–81, Berlin, Heidelberg, 2013. Springer-Verlag.
 [17] L. Huang, A. D. Joseph, B. Nelson, B. Rubinstein, and J. D. Tygar. Adversarial machine learning. In 4th ACM Workshop on Artificial Intelligence and Security (AISec 2011), pages 43–57, Chicago, IL, USA, 2011.
 [18] A. K. Jain and R. C. Dubes. Algorithms for clustering data. Prentice-Hall, Inc., NJ, USA, 1988.
 [19] M. Kloft and P. Laskov. Online anomaly detection under adversarial impact. In Proc. 13th Int’l Conf. on Artificial Intell. and Statistics, pages 405–412, 2010.
 [20] A. Kolcz and C. H. Teo. Feature weighting for improved classifier robustness. In Sixth Conf. on Email and Anti-Spam (CEAS), CA, USA, 2009.
 [21] Y. LeCun, L. Jackel, L. Bottou, A. Brunot, C. Cortes, J. Denker, H. Drucker, I. Guyon, U. Müller, E. Säckinger, P. Simard, and V. Vapnik. Comparison of learning algorithms for handwritten digit recognition. In Int’l Conf. ANNs, pages 53–60, 1995.
 [22] M. Pavan and M. Pelillo. Dominant sets and pairwise clustering. IEEE Trans. on Pattern Analysis and Machine Intelligence, 29(1):167–172, 2007.
 [23] R. Perdisci, D. Ariu, and G. Giacinto. Scalable fine-grained behavioral clustering of HTTP-based malware. Computer Networks, 57(2):487–500, 2013.
 [24] R. Perdisci, I. Corona, and G. Giacinto. Early detection of malicious flux networks via large-scale passive DNS traffic analysis. IEEE Trans. on Dependable and Secure Comp., 9(5):714–726, 2012.
 [25] F. Pouget, M. Dacier, J. Zimmerman, A. Clark, and G. Mohay. Internet attack knowledge discovery via clusters and cliques of attack traces. J. Information Assurance and Security, Vol. 1, Issue 1, March 2006.
 [26] G. Punj and D. W. Stewart. Cluster analysis in marketing research: Review and suggestions for application. J. Marketing Res., 20(2):134, May 1983.
 [27] D. B. Skillicorn. Adversarial knowledge discovery. IEEE Intelligent Systems, 24:54–61, 2009.
 [28] L. Spitzner. Honeypots: Tracking Hackers. Addison-Wesley Professional, 2002.
 [29] U. von Luxburg. Clustering stability: An overview. Foundations and Trends in ML, 2(3):235–274, 2010.
 [30] T. Zhang, R. Ramakrishnan, and M. Livny. Birch: an efficient data clustering method for very large databases. In Proc. 1996 ACM SIGMOD Int’l Conf. on Management of data, SIGMOD ’96, pages 103–114, NY, USA, 1996. ACM.