Adversarial Ranking Attack and Defense

02/26/2020 · Mo Zhou, et al. · Xi'an Jiaotong University

Deep Neural Network (DNN) classifiers are vulnerable to adversarial attack, where an imperceptible perturbation can result in misclassification. However, the vulnerability of DNN-based image ranking systems remains under-explored. In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations. Specifically, the expected ranking order is first represented as a set of inequalities, and then a triplet-like objective function is designed to obtain the optimal perturbation. To counter these attacks, a defense method is also proposed to improve the ranking system robustness, which can mitigate all the proposed attacks simultaneously. Our adversarial ranking attacks and defense are evaluated on the MNIST, Fashion-MNIST, and Stanford-Online-Products datasets. Experimental results demonstrate that a typical deep ranking system can be effectively compromised by our attacks, while its robustness can be moderately improved with our defense. Furthermore, the transferable and universal properties of our adversary illustrate the possibility of a realistic black-box attack.


1 Introduction

Despite their successful application in computer vision tasks such as image classification [35, 25], Deep Neural Networks (DNNs) have been found vulnerable to adversarial attacks. In particular, a DNN's prediction can be arbitrarily changed by applying an imperceptible perturbation to the input image [75, 21]. Moreover, such adversarial attacks can effectively compromise recent state-of-the-art DNNs such as Inception [73, 74] and ResNet [25]. This poses a serious security risk to many DNN-based applications such as face recognition, where recognition evasion or impersonation can be easily achieved [15, 70, 34, 78].

Figure 1: Adversarial ranking attack that can raise or lower the rank of chosen candidates by adversarial perturbations. In Candidate Attack, adversarial perturbation is added to the candidate and its rank is raised (CA+) or lowered (CA-). In Query Attack, adversarial perturbation is added to the query image, and the ranks of chosen candidates are raised (QA+) or lowered (QA-).

Previous adversarial attacks primarily focus on classification; however, we speculate that DNN-based image ranking systems [5, 8, 76, 33, 56, 19, 39] also suffer from similar vulnerability. Take image-based product search as an example: a fair ranking system should rank the database products according to their visual similarity to the query, as shown in Fig. 1 (row 1). Nevertheless, a malicious seller may attempt to raise the rank of his/her own product by adding a perturbation to its image (CA+, row 2), or to lower the rank of a competitor's product (CA-, row 3). Besides, a “man-in-the-middle” attacker (e.g., a malicious advertising company) could hijack and imperceptibly perturb the query image in order to promote (QA+, row 4) or impede (QA-, row 5) the sales of specific products.

Unlike classification, where each image is predicted independently, the rank of a candidate in image ranking depends on both the query and the other candidates; the relative relations among candidates and queries determine the final ranking order. Therefore, we argue that existing adversarial classification attacks are incompatible with the ranking scenario, and the adversarial ranking attack needs to be studied thoroughly.

In this paper, adversarial ranking attack aims to raise or lower the ranks of some chosen candidates with respect to a specific query set. This can be achieved by either Candidate Attack (CA) or Query Attack (QA). In particular, CA is defined as raising (abbr. CA+) or lowering (abbr. CA-) the rank of a single candidate with respect to the query set by perturbing the candidate itself, while QA is defined as raising (abbr. QA+) or lowering (abbr. QA-) the ranks of a candidate set with respect to a single query by perturbing the query. Thus, adversarial ranking attack can be achieved by performing CA on each chosen candidate, or QA on each query. In practice, the choice of CA or QA depends on the accessibility of the candidate or the query, respectively, i.e., CA is feasible when the candidate can be modified, while QA is feasible when the query can be modified.

An effective implementation of these attacks is proposed in this paper. A typical DNN-based ranking model maps objects (i.e., queries and candidates) to a common embedding space, where the distances among them determine the final ranking order. Adding a perturbation to an object changes its position in the embedding space. Therefore, the essence of adversarial ranking attack is to find a proper perturbation that pushes the object to a desired position leading to the expected ranking order. Specifically, we first represent the expected ranking order as a set of inequalities. Subsequently, a triplet-like objective function is designed according to those inequalities and combined with Projected Gradient Descent (PGD) to efficiently obtain the desired adversarial perturbation.

As a countermeasure to the proposed attacks, adversarial ranking defense is worth investigating, especially for security-sensitive deep ranking applications. To date, the Madry defense [50] is regarded as the most effective method for classification defense. However, we empirically discovered a primary challenge of diverging training loss when directly adapting this mechanism to ranking defense, possibly because the generated adversarial examples are too “strong”. In addition, such a defense mechanism needs to defend against each distinct ranking attack individually, whereas a generic defense against all of the CA+, CA-, QA+ and QA- attacks is preferred.

To this end, a shift-distance based ranking defense is proposed, which can simultaneously defend against all the attacks. Note that the position shift of objects in the embedding space is the key to all ranking attacks. Although different attacks prefer distinct shift directions (e.g., CA+ and CA- often prefer opposite directions), a large shift distance is their common preference. If we can reduce the embedding shift distance incurred by adversarial perturbation, all attacks are mitigated simultaneously. Specifically, we first propose a shift-distance based ranking attack, which aims to push the objects as far from their original positions as possible. Then, the adversarial examples generated by this attack are used in adversarial training. Experimental results show that our ranking defense converges and moderately improves model robustness.

In addition, our ranking attacks have some properties that matter for realistic applications. First, our adversary is transferable, i.e., the adversary obtained from a known DNN ranker can be directly used to attack an unknown DNN ranker (i.e., one whose architecture and parameters are unknown). Second, our attacks can be extended to universal ranking attacks with only a slight performance drop, i.e., we can learn a single universal perturbation applied to all candidates for CA, or to all queries for QA. Such properties illustrate the possibility of a practical black-box attack.

To the best of our knowledge, this is the first work that thoroughly studies the adversarial ranking attack and defense. In brief, our contributions are:

  1. The adversarial ranking attack is defined and implemented, which can intentionally change the ranking results by perturbing the candidates or queries.

  2. An adversarial ranking defense method is proposed to improve the ranking model robustness, and mitigate all the proposed attacks simultaneously.

2 Related Works

Adversarial Attacks. Szegedy et al. [75] found that DNNs are susceptible to imperceptible adversarial perturbations added to their inputs, due to an intriguing “blind spot” property, which was later ascribed to the local linearity [21] of neural networks. Following these findings, many white-box attacks (where model architecture and parameters are known to the adversary) [54, 61, 36, 7, 10, 13, 66, 72, 50, 80, 9, 20] have been proposed to effectively compromise state-of-the-art DNN classifiers. Among them, PGD [50] is regarded as one of the most powerful attacks [2]. Notably, adversarial examples have been found to be transferable [60, 59] among different neural network classifiers, which inspired a series of black-box attacks [71, 79, 83, 45, 14, 28]. On the other hand, universal (i.e., image-agnostic) adversarial perturbations have also been discovered [53, 41]. The existence of adversarial examples has stimulated research interest in areas such as object detection [48, 11, 85], semantic segmentation [1], and automatic speech recognition [65]. It is even possible to create physical adversarial examples [70, 4, 18, 78].

Deep Ranking. Different from the traditional “learning to rank” [42, 31] methods, DNN-based ranking methods often embed data samples (including both queries and candidates) of all modalities into a common embedding space, and subsequently determine the ranking order based on distance. Such workflow has been adopted in distance metric learning [8, 76, 57, 30], image retrieval [5], cross-modal retrieval [56, 19, 39, 33], and face recognition [67].

Adversarial Attacks in Deep Ranking. For information retrieval and ranking systems, the risk of malicious users manipulating the ranking always exists [23, 27]. However, only a few research efforts have been made on adversarial attacks in deep ranking. Liu et al. [46] proposed adversarial queries leading to incorrect retrieval results, while Li et al. [40] staged a similar attack with a universal perturbation that corrupts listwise ranking results. None of the aforementioned research efforts explore the adversarial ranking attack.

Adversarial Defenses. Adversarial attacks and defenses are engaged in a continuing arms race [84]. Gradient masking-based defenses can be circumvented [3], and defensive distillation [58, 62] has been compromised by C&W [7, 6]. As claimed in [26], an ensemble of weak defenses is insufficient against adversarial examples. Notably, adversarial training [21, 50, 29, 16, 37, 69, 77, 86, 55], one of the earliest defense methods [75], remains among the most effective defenses. Other types of defenses include adversarial detection [47, 52], input transformation/reconstruction/replacement [64, 49, 24, 51, 17], randomization [44, 43], feature denoising [82], network verification [32, 22], and evidential deep learning [68], etc. However, defense for deep ranking systems remains mostly uncharted.

3 Adversarial Ranking

Generally, a DNN-based ranking task can be formulated as a metric learning problem. Given a query q and a candidate set X, deep ranking is to learn a mapping f (usually implemented as a DNN) which maps all candidates and the query into a common embedding space, such that the relative distances among the embedding vectors satisfy the expected ranking order. For instance, if candidate c_p is more similar to the query q than candidate c_n, the mapping f is encouraged to satisfy the inequality ‖f(q) − f(c_p)‖ < ‖f(q) − f(c_n)‖ (sometimes cosine distance is used instead), where ‖·‖ denotes the ℓ2 norm. For brevity, we denote ‖f(a) − f(b)‖ as d(a, b) in the following text.
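
To make this formulation concrete, the following PyTorch sketch (an illustration under the definitions above, not the authors' released code; the embedding network f and the Euclidean distance are assumptions) ranks a candidate set by embedding distance to a query.

import torch

def rank_candidates(f, query, candidates):
    """Rank candidates by L2 distance to the query in the embedding space of f.

    f          : torch.nn.Module mapping image batches to embedding vectors
    query      : tensor of shape (1, C, H, W)
    candidates : tensor of shape (N, C, H, W)
    Returns candidate indices sorted from most to least similar (index 0 = top rank).
    """
    with torch.no_grad():
        e_q = f(query)                            # (1, D) query embedding f(q)
        e_c = f(candidates)                       # (N, D) candidate embeddings f(c)
        dist = torch.cdist(e_q, e_c).squeeze(0)   # (N,)  distances d(q, c)
    return torch.argsort(dist)                    # smaller distance = higher rank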

Therefore, adversarial ranking attack is to find a proper adversarial perturbation which changes the ranking order as expected. For example, if a less relevant candidate c_n is expected to be ranked ahead of a relevant candidate c_p, it is desired to find a perturbation r to perturb c_n, i.e., c_adv = c_n + r, such that the inequality d(q, c_p) < d(q, c_n) is changed into d(q, c_adv) < d(q, c_p). In the following, we describe Candidate Attack and Query Attack in detail.

3.1 Candidate Attack

Candidate Attack (CA) aims to raise (abbr. CA+) or lower (abbr. CA-) the rank of a single candidate c with respect to a set of queries Q by adding a perturbation r to the candidate itself, i.e., c_adv = c + r.

Let Rank_X(q, c) denote the rank of the candidate c with respect to the query q, where X indicates the set of all candidates, and a smaller rank value represents a higher ranking. Thus, the CA+ that raises the rank of c with respect to every query q ∈ Q by a perturbation r can be formulated as the following problem:

min_r  Σ_{q∈Q} Rank_X(q, c + r)        (1)

s.t.  r ∈ Γ,  c + r ∈ [0, 1]^n        (2)

where Γ = {r : ‖r‖_∞ ≤ ε} is the ε-bounded ℓ∞-neighborhood of the origin, ε is a predefined small positive constant, the constraint r ∈ Γ limits the perturbation to be “visually imperceptible”, and c + r ∈ [0, 1]^n ensures the adversarial example remains a valid input image. Although alternative “imperceptible” constraints based on other norms exist [72, 12, 10, 7, 54], we simply follow [21, 36, 50] and use the ℓ∞ constraint.

However, the optimization problem in Eq. (1)–(2) cannot be directly solved due to the discrete nature of the rank value Rank_X(q, c + r). In order to solve the problem, a surrogate objective function is needed.

In metric learning, given a query q and two candidates c_p, c_n where c_p is ranked ahead of c_n, i.e., Rank_X(q, c_p) < Rank_X(q, c_n), the ranking order is represented as the inequality d(q, c_p) < d(q, c_n) and formulated in the triplet loss:

L_triplet(q, c_p, c_n) = [β + d(q, c_p) − d(q, c_n)]_+        (3)

where [·]_+ denotes max(0, ·), and β is a manually defined constant margin. This function is known as the triplet (i.e., pairwise ranking) loss [8, 67].
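
For reference, a minimal PyTorch sketch of Eq. (3) is given below; the margin value beta is a placeholder rather than the paper's setting.

import torch
import torch.nn.functional as F

def triplet_loss(f, q, c_p, c_n, beta=0.2):
    """Triplet loss of Eq. (3): [beta + d(q, c_p) - d(q, c_n)]_+ ."""
    d_pos = F.pairwise_distance(f(q), f(c_p))   # d(q, c_p)
    d_neg = F.pairwise_distance(f(q), f(c_n))   # d(q, c_n)
    return torch.clamp(beta + d_pos - d_neg, min=0).mean()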

Similarly, the attacking goal of CA+ in Eq. (1) can be readily converted into a series of inequalities, and subsequently turned into a sum of triplet losses,

L_CA+(c + r, Q; X) = Σ_{q∈Q} Σ_{x∈X} [d(q, c + r) − d(q, x)]_+        (4)
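
The CA+ objective of Eq. (4) can be sketched in PyTorch as follows (an illustration, not the authors' code; the function name ca_plus_loss and the batching conventions are our assumptions). It accumulates one zero-margin triplet term per (query, candidate) pair.

import torch

def ca_plus_loss(f, adv_candidate, queries, candidates):
    """CA+ loss of Eq. (4): sum over q in Q and x in X of [d(q, c+r) - d(q, x)]_+ ."""
    e_adv = f(adv_candidate)                 # (1, D)   embedding of c + r
    e_q = f(queries)                         # (|Q|, D) embeddings of the chosen queries
    e_x = f(candidates)                      # (N, D)   embeddings of the other candidates
    d_adv = torch.cdist(e_q, e_adv)          # (|Q|, 1) d(q, c+r)
    d_x = torch.cdist(e_q, e_x)              # (|Q|, N) d(q, x)
    # One zero-margin triplet term per (q, x) pair: c + r should be closer to q than x.
    return torch.clamp(d_adv - d_x, min=0).sum()
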

In this way, the original problem in Eq. (1)–(2) can be reformulated into the following constrained optimization problem:

r* = argmin_{r∈Γ} L_CA+(c + r, Q; X),   s.t.  c + r ∈ [0, 1]^n        (5)

To solve this optimization problem, the Projected Gradient Descent (PGD) method [50, 36] (a.k.a. the iterative version of FGSM [21]) can be used. Note that PGD is one of the most effective first-order gradient-based algorithms [2], popular among related works on adversarial attack.

Specifically, in order to find an adversarial perturbation r that creates the desired adversarial candidate c_adv = c + r, the PGD algorithm alternates two steps at every iteration t. Step one updates c_adv according to the gradient of Eq. (4), while step two clips the result of step one to fit in the ε-neighboring region Γ:

c_adv^(t+1) = Clip_{c,ε}( c_adv^(t) − α · sign(∇ L_CA+(c_adv^(t), Q; X)) )        (6)

where α is a constant hyper-parameter indicating the PGD step size, and c_adv^(0) is initialized as c. After the last iteration, the desired adversarial candidate c_adv is obtained, which is optimized to satisfy as many inequalities d(q, c_adv) < d(q, x) as possible. Each inequality represents a pairwise ranking sub-problem, hence the adversarial candidate will be ranked ahead of other candidates with respect to every specified query q ∈ Q.
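
A minimal PGD sketch for CA+ is given below, following the update and projection steps of Eq. (6); it reuses the ca_plus_loss sketch above, the values of eps, alpha and steps are placeholders, and a [0, 1] pixel range is assumed.

import torch

def pgd_ca_plus(f, candidate, queries, candidates, eps=0.1, alpha=0.01, steps=24):
    """PGD loop for CA+ following Eq. (6): descend on Eq. (4), then project into the eps-ball."""
    adv = candidate.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = ca_plus_loss(f, adv, queries, candidates)          # Eq. (4)
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv - alpha * grad.sign()                       # gradient descent step
            adv = candidate + (adv - candidate).clamp(-eps, eps)  # project into the L-inf ball
            adv = adv.clamp(0.0, 1.0)                             # keep a valid image
    return adv.detach()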

Likewise, the CA- that lowers the rank of a candidate c with respect to a set of queries Q can be obtained in a similar way:

L_CA-(c + r, Q; X) = Σ_{q∈Q} Σ_{x∈X} [d(q, x) − d(q, c + r)]_+        (7)

3.2 Query Attack

Query Attack (QA) is supposed to raise (abbr. QA+) or lower (abbr. QA-) the ranks of a set of candidates C with respect to the query q, by adding an adversarial perturbation r to the query, i.e., q_adv = q + r. Thus, QA and CA are two “symmetric” attacks. The QA- for lowering the ranks can be formulated as follows:

max_r  Σ_{c∈C} Rank_X(q + r, c),   s.t.  r ∈ Γ,  q + r ∈ [0, 1]^n        (8)

where Γ is the ε-bounded ℓ∞-neighborhood as before, now applied to the query. Likewise, this attacking objective can also be transformed into the following constrained optimization problem:

L_QA-(q + r, C; X) = Σ_{c∈C} Σ_{x∈X} [d(q + r, x) − d(q + r, c)]_+        (9)

r* = argmin_{r∈Γ} L_QA-(q + r, C; X),   s.t.  q + r ∈ [0, 1]^n        (10)

and it can be solved with the PGD algorithm. Similarly, the QA+ loss function for raising the ranks of the candidates in C is as follows:

L_QA+(q + r, C; X) = Σ_{c∈C} Σ_{x∈X} [d(q + r, c) − d(q + r, x)]_+        (11)

Unlike CA, QA perturbs the query image, and hence may drastically change its semantics, resulting in abnormal retrieval results. For instance, after perturbing a “lamp” query image, some unrelated candidates (e.g., “shelf”, “toaster”, etc.) may appear in the top of the returned list. Thus, an ideal query attack should preserve the query semantics, i.e., the candidates in X∖C (the complement of the chosen set C) should retain their original ranks if possible. Hence, we propose the Semantics-Preserving Query Attack (SP-QA), which adds an SP term to mitigate the semantic change, e.g.,

L_SP-QA(q + r, C; X) = L_QA(q + r, C; X) + ξ · L_QA+(q + r, G_q; X)        (12)

where L_QA denotes the QA- loss of Eq. (9) or the QA+ loss of Eq. (11), G_q contains the top-k most-relevant candidates corresponding to the unperturbed query q, and the term L_QA+(q + r, G_q; X) helps preserve the query semantics by retaining those candidates near the top of the retrieved ranking list. The integer k is predefined, and the constant ξ is a hyper-parameter balancing the attack effect and semantics preservation. Unless otherwise mentioned, in the following text QA means SP-QA by default.
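
The QA- variant of this objective can be sketched as follows (illustrative only; the helper names, the balancing constant xi and the top-k set passed as topk_clean are our assumptions, not the authors' code).

import torch

def qa_minus_loss(f, adv_query, chosen, candidates):
    """QA- term of Eq. (9): push every chosen candidate c away from the perturbed query."""
    e_q = f(adv_query)                        # (1, D)
    d_c = torch.cdist(e_q, f(chosen))         # (1, |C|) d(q+r, c)
    d_x = torch.cdist(e_q, f(candidates))     # (1, N)   d(q+r, x)
    return torch.clamp(d_x.unsqueeze(-1) - d_c.unsqueeze(1), min=0).sum()

def qa_plus_loss(f, adv_query, chosen, candidates):
    """QA+ term of Eq. (11): pull every chosen candidate c towards the perturbed query."""
    e_q = f(adv_query)
    d_c = torch.cdist(e_q, f(chosen))
    d_x = torch.cdist(e_q, f(candidates))
    return torch.clamp(d_c.unsqueeze(1) - d_x.unsqueeze(-1), min=0).sum()

def sp_qa_minus_loss(f, adv_query, chosen, candidates, topk_clean, xi=1.0):
    """SP-QA- of Eq. (12): attack term plus xi times a rank-preserving term on the
    top-k candidates of the unperturbed query (xi and topk_clean are placeholders)."""
    return (qa_minus_loss(f, adv_query, chosen, candidates)
            + xi * qa_plus_loss(f, adv_query, topk_clean, candidates))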

3.3 Robustness & Defense

Adversarial defense for classification has been extensively explored, and many of the proposed defenses follow the adversarial training mechanism [29, 37, 50]. In particular, the adversarial counterparts of the original training samples are used to replace or augment the training samples. To date, the Madry defense [50] is regarded as the most effective [77, 3] adversarial training method. However, when directly adapting such a classification defense to improve ranking robustness, we empirically discovered a primary challenge of diverging training loss, possibly because the generated adversarial examples are too “strong”. Moreover, such a defense mechanism needs to defend against each distinct attack individually. Therefore, a generic defense against all the proposed attacks is preferred.

Note that the underlying principle of adversarial ranking attack is to shift the embeddings of candidates/queries to a proper place, and a successful attack depends on a large shift distance as well as a correct shift direction. A large shift distance is an indispensable objective for all the CA+, CA-, QA+ and QA- attacks. Predictably, a reduction in shift distance could improve model robustness against all attacks simultaneously.

To this end, we propose a “maximum-shift-distance” attack that pushes an embedding vector as far from its original position as possible (resembling the Feature Adversary [66] for classification), i.e., r* = argmax_{r∈Γ} d(x + r, x) for a sample x. Then we use the adversarial examples obtained from this attack to replace the original training samples for adversarial training, hence reducing the shift distance incurred by adversarial perturbations.
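
A PGD sketch of this maximum-shift-distance attack is given below (illustrative only; eps, alpha and the number of steps are placeholder values).

import torch

def max_shift_attack(f, x, eps=0.1, alpha=0.01, steps=24):
    """Maximize the embedding shift d(x + r, x) within the L-inf eps-ball (defense adversary)."""
    with torch.no_grad():
        e_orig = f(x)                                     # fixed original embedding f(x)
    adv = x.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        shift = torch.norm(f(adv) - e_orig, dim=1).sum()  # embedding shift distance
        grad = torch.autograd.grad(shift, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()               # ascend: push the embedding away
            adv = x + (adv - x).clamp(-eps, eps)          # stay within the eps-ball
            adv = adv.clamp(0.0, 1.0)                     # keep valid images
    return adv.detach()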

A ranking model can be normally trained with the defensive version of the triplet loss:

L_def(q, c_p, c_n) = L_triplet(q + r*_q, c_p + r*_{c_p}, c_n + r*_{c_n})        (13)

where r*_x = argmax_{r∈Γ} d(x + r, x) is the maximum-shift perturbation of the corresponding training sample.

Unlike the direct adaptation of Madry defense, the training loss does converge in our experiments.
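
One defensive training step can then be sketched as follows; it reuses the max_shift_attack and triplet_loss sketches above, and the optimizer handling is an assumption rather than the authors' training recipe.

def defensive_training_step(f, optimizer, q, c_p, c_n, eps=0.1):
    """One adversarial training step with the defensive triplet loss of Eq. (13)."""
    f.eval()   # freeze batch-norm statistics while crafting the adversaries
    adv_q = max_shift_attack(f, q, eps=eps)
    adv_cp = max_shift_attack(f, c_p, eps=eps)
    adv_cn = max_shift_attack(f, c_n, eps=eps)

    f.train()
    optimizer.zero_grad()
    loss = triplet_loss(f, adv_q, adv_cp, adv_cn)   # ordinary triplet loss on adversaries
    loss.backward()
    optimizer.step()
    return loss.item()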

4 Experiments

ε    CA+    CA-    QA+    QA-
(CT) Cosine Distance, Triplet Loss (R@1=99.1%)
0 50 50 50 50 2.1 2.1 2.1 2.1 50 50 50 50 0.5 0.5 0.5 0.5
0.01 44.6 45.4 47.4 47.9 3.4 3.2 3.1 3.1 45.2 46.3 47.7 48.5 0.9 0.7 0.6 0.6
0.03 33.4 37.3 41.9 43.9 6.3 5.9 5.7 5.6 35.6 39.2 43.4 45.8 1.9 1.4 1.1 1.1
0.1 12.7 17.4 24.4 30.0 15.4 14.9 14.8 14.7 14.4 21.0 30.6 37.2 5.6 4.4 3.7 3.5
0.3 2.1 9.1 13.0 17.9 93.9 93.2 93.0 92.9 6.3 11.2 22.5 32.1 8.6 6.6 5.3 4.8
Table 1: Adversarial ranking attack on vanilla model with MNIST. The “+” attacks (i.e. CA+ and QA+) raise the rank of chosen candidates towards (); while the “-” attacks (i.e. CA- and QA-) lower the ranks of chosen candidates towards (). Applying QA+ attacks on the model, the SP term keeps the ranks of no larger than , respectively, regardless of . With the QA- counterpart, the ranks of are kept no larger than , respectively, regardless of . For all the numbers in the table, “%” is omitted.

To validate the proposed attacks and defense, we use three commonly used ranking datasets: MNIST [38], Fashion-MNIST [81], and Stanford Online Products (SOP) [57]. We train typical vanilla models on these datasets with PyTorch [63], and conduct attacks on their corresponding validation sets (used as the candidate set X).

Evaluation Metric. Adversarial ranking attack aims to change the ranks of candidates. For each candidate c, its normalized rank is calculated as R(c, q) = Rank_X(q, c) / |X| × 100%, where |X| is the length of the full ranking list. Thus, R(c, q) ∈ [0%, 100%], and a top-ranked c will have a small R(c, q).
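
This metric can be sketched in PyTorch as follows (the helper name normalized_rank and the batching conventions are our assumptions).

import torch

def normalized_rank(f, query, candidate, all_candidates):
    """Normalized rank R(c, q) in percent: 0% means ranked first, near 100% means last."""
    with torch.no_grad():
        e_q = f(query)                                # (1, D)
        d_c = torch.cdist(e_q, f(candidate))          # (1, 1) distance of the chosen candidate
        d_all = torch.cdist(e_q, f(all_candidates))   # (1, N) distances of the full list
        rank = (d_all < d_c).sum().item()             # candidates strictly closer than c
    return 100.0 * rank / d_all.numel()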

Performance of Attack. To measure the performance of a single CA attack, we average the normalized rank of the candidate c across every query q ∈ Q, i.e., R_CA(c) = (1/|Q|) Σ_{q∈Q} R(c, q). Similarly, the performance of a single QA attack can be measured by the average normalized rank across every candidate c ∈ C, i.e., R_QA(q) = (1/|C|) Σ_{c∈C} R(c, q). For the overall performance of an attack, we conduct a number of independent attacks and report the mean of R_CA(c) or R_QA(q), accordingly.

CA+ & QA+. For CA+, the query set Q is randomly sampled from X. Likewise, for QA+, the candidate set C is randomly sampled from X. Without attack, both R_CA(c) and R_QA(q) will approximate 50%, and the attacks should significantly decrease this value.

CA- & QA-. In practice, the query set Q for CA- and the candidate set C for QA- cannot be randomly sampled, because the two attacks are often meant to lower some top-ranked candidates. Thus, the two sets should be selected from the top-ranked samples in X. Formally, given the candidate c for CA-, we randomly sample the queries with respect to which c is top-ranked as Q. Given the query q for QA-, candidates are randomly sampled from the top-ranked candidates of q as C. Without attack, both R_CA(c) and R_QA(q) will be close to 0%, and the attacks should significantly increase this value.

Hyper-Parameters. We conduct CA with varying numbers of queries, and QA with varying numbers of candidates, respectively. The SP balancing parameter ξ is set separately for QA+ and QA-. In addition, we investigate attacks of different strength ε, e.g., ε ∈ {0.01, 0.03, 0.1, 0.3}, where ε = 0.3 is the strongest attack. The PGD step size α and the number of PGD iterations are set empirically. We perform multiple independent attacks and report the average performance.

Adversarial Defense. Ranking models are trained using Eq. (13) with the strongest adversary following the procedure of Madry defense [50].

4.1 MNIST Dataset

ε    CA+    CA-    QA+    QA-
(CTD) Cosine Distance, Triplet Loss, Defensive (R@1=98.3%)
0 50 50 50 50 2.0 2.0 2.0 2.0 50 50 50 50 0.5 0.5 0.5 0.5
0.01 48.9 49.3 49.4 49.5 2.2 2.2 2.2 2.1 49.9 49.5 49.5 49.7 0.5 0.5 0.5 0.5
0.03 47.4 48.4 48.6 48.9 2.5 2.5 2.4 2.4 48.0 48.5 49.2 49.5 0.6 0.6 0.5 0.5
0.1 42.4 44.2 45.9 46.7 3.8 3.6 3.5 3.4 43.2 45.0 47.4 48.2 1.0 0.8 0.7 0.7
0.3 30.7 34.5 38.7 40.7 7.0 6.7 6.5 6.5 33.2 37.2 42.3 45.1 2.4 1.9 1.6 1.5
Table 2: Adversarial ranking defense with MNIST. Applying QA+ attacks on model, the ranks of candidates in are kept no larger than , respectively, regardless of . With the QA- counterpart, the ranks of are kept no larger than , respectively, regardless of .

Following conventional settings with the MNIST [38] dataset, we train a CNN ranking model comprising 2 convolutional layers and 1 fully-connected layer. This CNN architecture (hence denoted as C2F1) is identical to the one used in [50] except for the removal of the last fully-connected layer. Specifically, the ranking model is trained with cosine distance and triplet loss. The retrieval performance of the model is Recall@1 = 99.1% (R@1), as shown in Tab. 1 in grey highlight.

Attacking results against this vanilla model (i.e., the ranking model which is not enhanced with our defense method) are presented in Tab. 1. For example, a strong CA+ attack (i.e., ε = 0.3) can raise the rank from 50% to 2.1%. Likewise, the rank can still be raised to 9.1%, 13.0%, and 17.9% for larger numbers of chosen queries, respectively.

On the other hand, a strong CA- attack (ε = 0.3) can lower the rank from 2.1% to 93.9%. The results of strong CA- attacks with more chosen queries are similar to this case.

The results of QA+ and QA- are also shown in Tab. 1. The rank changes with QA attacks are less dramatic (but still significant) than those with CA. This is due to the additional difficulty introduced by the SP term in Eq. (12), and the QA attack effectiveness is inversely correlated with ξ. For instance, a strong QA- attack (ε = 0.3) can only lower the rank from 0.5% to 8.6%, but the attacking effect can be further boosted by decreasing ξ. More experimental results are presented in the following discussion. In brief, our proposed attacks against the vanilla ranking model are effective.

Next, we evaluate the performance of our defense method. Our defense should enhance the robustness of a ranking model, which can be measured by the difference in attack effectiveness with and without our defense. As a common phenomenon of adversarial training, our defense mechanism leads to a slight retrieval performance degradation for unperturbed input (highlighted in blue in Tab. 2), but the attacking effectiveness is clearly mitigated by our defense. For instance, the same strong CA+ attack (ε = 0.3) on the defensive model (i.e., the ranking model enhanced by our defense method) can only raise the rank from 50% to 30.7%, whereas its vanilla counterpart is raised to 2.1%. Further analysis suggests that the weights in the first convolution layer of the defensive model are closer to zero and have smaller variance than those of the vanilla model, which may help prevent adversarial perturbations from pushing the layer outputs into the locally linear region of ReLU [21].

To visualize the effect of our attacks and defense, we track the attacking effect with ε varying from 0 to 0.3 on the vanilla and defensive models, as shown in Fig. 2. It is noted that our defense can significantly suppress the maximum embedding shift distance incurred by adversarial perturbation, but the defensive model is still not completely immune to attacks. We speculate the defensive model still has “blind spots” [75] in some local areas that could be exploited by the attacks.

Figure 2: Comparison of attacks on the vanilla and defensive models. Apart from the ranks of chosen candidates, we also measure the maximum shift distance of the embedding vectors that adversarial perturbation could incur.

In summary, these results and further experiments (see supplementary material) suggest that: (1) deep ranking models are vulnerable to adversarial ranking attacks, no matter what loss function or distance metric is selected; (2) vanilla models trained with contrastive loss are more robust than those trained with triplet loss, possibly because contrastive loss explicitly reduces the intra-class embedding variation; additionally, our defense method consistently improves the robustness of all these models; (3) different distance metrics have an almost negligible influence on robustness, although Euclidean distance-based models are slightly more susceptible to weak (small-ε) attacks; (4) Euclidean distance-based models are harder to defend than cosine distance-based ones. Beyond these experiments, we also find that the margin hyper-parameter of the triplet loss and the dimensionality of the embedding space have marginal influence on model robustness.

4.2 Fashion-MNIST Dataset

ε    CA+    CA-    QA+    QA-
(CT) Cosine Distance, Triplet Loss (R@1=88.8%)
0 50 50 50 50 1.9 1.9 1.9 1.9 50 50 50 50 0.5 0.5 0.5 0.5
0.01 36.6 39.9 43.2 44.8 5.6 5.1 4.9 4.8 39.4 42.0 45.3 47.1 2.1 1.6 1.2 1.1
0.03 19.7 25.4 31.7 35.6 15.5 14.8 14.4 14.3 21.7 28.2 35.7 40.6 5.6 4.1 3.3 2.9
0.1 3.7 10.5 17.3 22.7 87.2 86.7 86.3 86.3 7.1 12.4 23.6 32.5 10.9 8.3 6.7 6.0
0.3 1.3 9.4 16.0 21.5 100.0 100.0 100.0 100.0 6.3 10.8 21.8 31.7 12.6 9.4 7.5 6.6
(CTD) Cosine Distance, Triplet Loss, Defensive (R@1=79.6%)
0 50 50 50 50 1.2 1.2 1.2 1.2 50 50 50 50 0.5 0.5 0.5 0.5
0.01 48.9 48.9 49.3 49.3 1.4 1.4 1.4 1.4 49.4 49.9 49.9 50.0 0.5 0.5 0.5 0.5
0.03 47.1 47.9 48.3 48.3 2.0 1.9 1.8 1.8 48.3 49.1 49.5 49.8 0.7 0.6 0.6 0.6
0.1 42.4 43.5 44.5 44.8 4.6 4.2 4.0 3.9 45.4 47.2 48.7 49.2 1.4 1.2 1.1 1.1
0.3 32.5 35.4 37.5 38.2 11.2 10.5 10.1 10.0 39.3 42.6 46.5 47.8 3.9 3.3 3.0 2.9
Table 3: Adversarial ranking attack and defense on Fashion-MNIST. The lowest ranks of the SP candidates G_q remain bounded in QA+ and QA-, respectively.

Fashion-MNIST [81] is an MNIST-like but more difficult dataset, comprising 60,000 training examples and 10,000 test samples. The samples are 28×28 greyscale images covering 10 different fashion product classes, including “T-shirt” and “dress”, etc. We train the vanilla and defensive models based on the cosine distance and triplet loss and conduct attack experiments.

The attack and defense results are available in Tab. 3. From the table, we note that our attacks achieve an even better effect compared to the experiments on MNIST. For example, a strong CA+ attack (ε = 0.3) can raise the rank to 1.3%. On the other hand, despite the moderate improvement in robustness, the defensive model performs worse in unperturbed sample retrieval, as expected. The performance degradation is more pronounced on this dataset compared to MNIST. We speculate the differences are related to the increased dataset difficulty.

4.3 Stanford Online Products Dataset

The Stanford Online Products (SOP) dataset [57] contains about 120k images of about 23k classes of real online products from eBay for metric learning. We use the same dataset split as used in the original work [57]. We also train the same vanilla ranking model using the same triplet ranking loss function with Euclidean distance, except that GoogLeNet [73] is replaced with ResNet-18 [25]. The ResNet-18 achieves better retrieval performance.

Attack and defense results on SOP are presented in Tab. 4. It is noted that our attacks are quite effective on this difficult large-scale dataset: merely an ε = 0.01 perturbation to any candidate image can make it rank ahead of or behind nearly all the other candidates (as shown by the CA+ and CA- results with ε = 0.01). The QA attacks on this dataset are significantly effective as well. On the other hand, our defense method leads to drastically decreased retrieval performance, i.e., R@1 drops from 63.1% to 40.2%, which is expected on such a difficult dataset. Meanwhile, our defense can moderately improve the model robustness against relatively weak adversarial examples (e.g., ε = 0.01), but improving model robustness on this dataset is more difficult compared to the other datasets.

By comparing the results among all three datasets, we find that ranking models trained on simpler datasets are less prone to attack and are easier to defend. On the contrary, models trained on harder datasets are more susceptible to adversarial ranking attack and are more difficult to defend. Therefore, we speculate that models used in realistic applications could be easier to attack, because they are usually trained on larger-scale and more difficult datasets.

ε    CA+    CA-    QA+    QA-
(ET) Euclidean Distance, Triplet Loss (R@1=63.1%)
0 50 50 50 50 1.9 1.9 1.9 1.9 50 50 50 50 0.5 0.5 0.5 0.5
0.01 0.0 0.8 2.0 2.6 99.7 99.6 99.4 99.3 4.8 7.0 16.3 25.8 54.9 40.2 27.1 21.9
0.03 0.0 0.3 1.0 1.5 100.0 100.0 100.0 100.0 1.6 3.3 10.0 19.2 68.1 52.4 36.6 30.1
0.1 0.0 0.3 1.2 1.7 100.0 100.0 100.0 100.0 1.6 3.4 9.9 18.8 69.8 53.7 37.1 30.0
0.3 0.0 0.2 0.8 1.3 100.0 100.0 100.0 100.0 0.8 2.0 7.3 15.6 77.7 61.7 43.5 35.0
(ETD) Euclidean Distance, Triplet Loss, Defensive (R@1=40.2%)
0 50 50 50 50 2.0 2.0 2.0 2.0 50 50 50 50 0.5 0.5 0.5 0.5
0.01 6.5 12.6 17.9 19.6 81.3 79.2 77.2 76.4 14.1 22.8 34.5 40.5 31.2 22.2 15.5 13.0
0.03 0.4 5.0 10.5 12.8 98.7 98.4 98.0 97.9 7.3 13.6 26.3 34.9 42.2 29.8 19.9 16.1
0.1 0.0 6.2 12.8 15.1 100.0 100.0 99.9 99.9 7.0 12.0 23.7 32.9 52.9 40.2 29.0 24.5
0.3 0.0 4.8 10.8 13.2 100.0 100.0 100.0 100.0 5.0 9.2 20.1 29.9 59.8 46.2 33.2 27.6
Table 4: Adversarial ranking attack and defense on SOP. With different , the worst ranks of in QA+ are , and those for QA- are , respectively.

5 Discussions

In this section, we study the transferability of our adversarial ranking examples and universal adversarial perturbation for ranking. Both of them illustrate the possibility of a practical black-box attack. Additionally, we also perform a parameter search on the balancing parameter ξ for QA.

5.1 Adversarial Example Transferability

As demonstrated in previous experiments, deep ranking models can be compromised by our white-box attacks. In realistic scenarios, the white-box attacks are not practical enough because the model to be attacked is often unknown (i.e., the architecture and parameters are unknown).

On the other hand, adversarial examples for classification have been found transferable [60, 59] (i.e., model-agnostic) between different models with different network architectures, and this transferability has become the foundation of a class of existing black-box attacks. Specifically, in such a typical attack, adversarial examples are generated from a replacement model [60] using a white-box attack, and are directly used to attack the black-box model.
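
Under the ranking setting, this protocol can be sketched as follows (illustrative only; it reuses the pgd_ca_plus and normalized_rank sketches from earlier sections, and the surrogate/target model names are placeholders).

def transfer_attack_rank(surrogate, target, candidate, queries, candidates, eps=0.3):
    """Craft a CA+ adversary on the surrogate ranker, then report its mean normalized
    rank under the black-box target ranker."""
    adv = pgd_ca_plus(surrogate, candidate, queries, candidates, eps=eps)
    ranks = [normalized_rank(target, q.unsqueeze(0), adv, candidates) for q in queries]
    return sum(ranks) / len(ranks)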

Adversarial ranking attack would be more practical if the adversarial ranking examples had similar transferability. Besides the C2F1 model, we train two additional vanilla models on the MNIST dataset: (1) LeNet [38], which has lower model capacity than C2F1; (2) ResNet-18 [25] (denoted as Res18), which has a better network architecture and higher model capacity.

CA+ Transfer (Black Box), ε = 0.3
From \ To    LeNet      C2F1       Res18
LeNet        50→16.6    35.1       34.3
C2F1         28.6       50→2.1     31.3
Res18        24.4       27.0       50→2.2
CA- Transfer (Black Box), ε = 0.3
From \ To    LeNet      C2F1       Res18
LeNet        2.5→63.7   2.1→10.0   2.1→9.1
C2F1         2.5→9.1    2.1→93.9   2.1→9.3
Res18        2.5→9.9    2.1→11.8   2.1→66.7
QA+ Transfer (Black Box), ε = 0.3
From \ To    LeNet      C2F1       Res18
LeNet        50→20.5    43.0       45.8
C2F1         43.5       50→6.3     45.4
Res18        41.4       40.4       50→14.1
QA- Transfer (Black Box), ε = 0.3
From \ To    LeNet      C2F1       Res18
LeNet        0.5→7.0    0.5→1.6    0.5→1.8
C2F1         0.5→1.0    0.5→8.6    0.5→1.9
Res18        0.5→0.8    0.5→1.2    0.5→6.9
Table 5: Transferability of adversarial ranking examples. Adversarial examples are generated from one model and directly used on another. We report the rank of the same with respect to the same across different models to illustrate the transfer attack effectiveness. Transferring adversarial examples to a model itself (the diagonal lines) is equivalent to white-box attack.

The results are presented in Tab. 5. For example, in the CA+ transfer attack, we generate adversarial candidates from the C2F1 model and directly use them to attack the Res18 model (row 2, column 3, top-left table), and the rank of the adversarial candidates with respect to the same queries is still raised to 31.3%. We also find the CA- transfer attack effective, where the ranks of our adversarial candidates are lowered, e.g., from 2.1% to 9.3% (row 2, column 3, bottom-left table). Similar results can be observed in the QA transfer experiments, though with weaker effect due to the SP term.

From these results, we find that: (1) a CNN with a better architecture and higher model capacity (i.e., Res18) is less susceptible to adversarial ranking attack. This conclusion is consistent with one of Madry's [50], which claims that higher model capacity could help improve model robustness; (2) adversarial examples generated from Res18 have the most significant effectiveness in transfer attack; (3) a CNN of low model capacity (i.e., LeNet) performs moderately in terms of both adversarial example transferability and model robustness. We speculate its robustness stems from a forced regularization effect due to its low model capacity. Beyond these, we also note that adversarial ranking examples are transferable regardless of differences in loss function or distance metric.

Apart from transferability across different architectures, we also investigated the transferability between the C2F1 models with different network parameters. Results suggest similar transferability between these models. Notably, when transferring adversarial examples to a defensive C2F1 model, the attacking effect is significantly mitigated. The result further demonstrates the effectiveness of our defense.

5.2 Universal Perturbation for Ranking

Recently, universal (i.e. image-agnostic) adversarial perturbation [53] for classification has been found possible, where a single perturbation may lead to misclassification when added to any image. Thus, we also investigate the existence of universal adversarial perturbation for ranking.

To this end, we follow [53] and formulate the image-agnostic CA+ (abbr. I-CA+). Given a set of candidates C and a set of queries Q, I-CA+ is to find a single universal adversarial perturbation r, so that the rank of every perturbed candidate c + r (c ∈ C) with respect to Q can be raised. The corresponding optimization problem of I-CA+ is:

r* = argmin_{r∈Γ} Σ_{c∈C} L_CA+(c + r, Q; X)        (14)

When applied with such a universal perturbation, the rank of any candidate in C with respect to Q is expected to be raised. The objective functions of I-CA-, I-QA+ and I-QA- can be obtained in a similar way. Note that, unlike [40], which aims to find a universal perturbation that makes an image retrieval system return irrelevant results, our universal perturbations have distinct purposes.
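
A sketch of learning such a universal perturbation for I-CA+ with PGD is given below; one shared perturbation r is optimized over a batch of sampled candidates, the ca_plus_loss sketch from Section 3.1 is reused, and the hyper-parameter values are placeholders.

import torch

def universal_ca_plus(f, sampled_candidates, queries, all_candidates,
                      eps=0.1, alpha=0.01, steps=100):
    """Learn one image-agnostic perturbation r shared by all sampled candidates (Eq. (14))."""
    r = torch.zeros_like(sampled_candidates[:1])          # a single (1, C, H, W) perturbation
    for _ in range(steps):
        r.requires_grad_(True)
        perturbed = (sampled_candidates + r).clamp(0.0, 1.0)
        # Sum the CA+ loss of Eq. (4) over every sampled candidate.
        loss = sum(ca_plus_loss(f, perturbed[i:i + 1], queries, all_candidates)
                   for i in range(perturbed.shape[0]))
        grad = torch.autograd.grad(loss, r)[0]
        with torch.no_grad():
            r = (r - alpha * grad.sign()).clamp(-eps, eps)   # keep r within the eps-ball
    return r.detach()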

We conduct the experiment on the MNIST dataset. For the I-CA+ attack, we randomly sample a subset of candidates from X for generating the universal perturbation. Following [53], another non-overlapping set of examples is randomly sampled from X to test whether the generated perturbation is generalizable to “unseen” (i.e., not used for generating the perturbation) images. Experiments for the other image-agnostic attacks are conducted similarly. Note that we only report the I-CA- and I-QA- effectiveness on the top-ranked samples, similar to CA- and QA-.

CA+ CA- QA+ QA-
50 2.1 2.1 93.9 50 0.2 0.5 94.1
I-CA+ I-CA- I-QA+ I-QA-
50 18.1 0.6 9.5 50 20.5 2.1 7.6
I-CA+ (unseen) I-CA- (unseen) I-QA+ (unseen) I-QA- (unseen)
50 18.5 0.7 9.4 50 21.0 2.2 7.4
Table 6: Universal Adversarial Perturbation for Ranking on MNIST. Each pair of results presents the original rank of chosen candidates and that after adding adversarial perturbation. Both , are set to . Parameter is set to to reduce attack difficulty.

As shown in Tab. 6, our I-CA+ can raise the ranks of the chosen candidates from 50% to 18.1%, and I-CA- can lower them from 0.6% to 9.5%. When added to “unseen” candidate images, our universal perturbations retain nearly the same effectiveness. This may be due to the low intra-class variance of the MNIST dataset.

5.3 Semantics Preserving for QA

As discussed previously, the Query Attack (QA) may drastically change the semantics of the query q. To alleviate this problem, the Semantics-Preserving (SP) term is added to the naive QA to help preserve the query semantics. Predictably, it is more difficult to perform QA with a large ξ, as the ranks of the candidates in G_q are almost not allowed to change.

To investigate the actual influence of the balancing parameter ξ, we perform a parameter search on it with the MNIST dataset. In particular, we set ξ to a range of values and compare the results. Note that when ξ = 0, the QA degenerates into the naive QA, as the SP term is eliminated. With a strong SP constant ξ, the semantics of the chosen query are almost not allowed to change, hence resulting in extreme difficulty of attack.

As shown in Tab. 7, setting ξ to 0 greatly boosts the attacking effect, but consequently the ranks of the candidates in G_q are drastically changed. In contrast, when ξ is set to an excessively large value for a perfectly stealthy QA, the attack can still noticeably raise the ranks of the chosen candidates in QA+, or lower them in QA-. During these attacks, the ranks of the candidates in G_q are kept low despite the extreme difficulty, which means the query semantics can be preserved. In practice, we empirically set ξ separately for QA+ and QA- to balance attack effectiveness and query semantics preservation.

QA+ QA-
m=1 2 5 10 m=1 2 5 10
(CT) Cosine distance, Triplet loss
0 0.2, 33.6 6.3, 23.7 18.5, 26.5 29.6, 25.7 94.1, 89.4 93.2, 90.3 92.6, 90.9 92.3, 91.2
6.3, 3.6 11.2, 5.7 22.5, 7.7 32.1, 7.7 55.5, 35.6 52.4, 37.6 50.2, 39.3 49.4, 40.0
14.1, 0.6 20.8, 0.7 31.2, 0.7 38.1, 0.7 8.6, 1.6 6.6, 1.6 5.3, 1.5 4.8, 1.5
37.9, 0.1 42.6, 0.1 46.3, 0.1 47.8, 0.1 1.9, 0.1 1.4, 0.1 1.2, 0.1 1.1, 0.1
Table 7: Parameter search on the Semantics-Preserving balancing parameter ξ with MNIST. We report two mean ranks in each cell: one for the chosen candidates C; another for the SP candidates G_q.

6 Conclusion

Deep ranking models are vulnerable to adversarial perturbations that can intentionally change the ranking results. In this paper, we define and implement adversarial ranking attacks that can compromise deep ranking models. We also propose an adversarial ranking defense that can significantly suppress the embedding shift distance and moderately improve the ranking model robustness. Moreover, the transferability of our adversarial examples and the existence of universal adversarial perturbations for ranking attack illustrate the possibility of a practical black-box attack and the potential risk to realistic ranking applications.

In future work, we may explore (1) better ranking loss functions and defenses; (2) better black-box attacks and more transferable adversarial examples.

References