Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction

We investigate whether model extraction can be used to "steal" the weights of sequential recommender systems, and the potential threats posed to victims of such attacks. This type of risk has attracted attention in image and text classification, but to our knowledge not in recommender systems. We argue that sequential recommender systems are subject to unique vulnerabilities due to the specific autoregressive regimes used to train them. Unlike many existing recommender attackers, which assume the dataset used to train the victim model is exposed to attackers, we consider a data-free setting, where training data are not accessible. Under this setting, we propose an API-based model extraction method via limited-budget synthetic data generation and knowledge distillation. We investigate state-of-the-art models for sequential recommendation and show their vulnerability under model extraction and downstream attacks. We perform attacks in two stages. (1) Model extraction: given different types of synthetic data and their labels retrieved from a black-box recommender, we extract the black-box model to a white-box model via distillation. (2) Downstream attacks: we attack the black-box model with adversarial samples generated by the white-box recommender. Experiments show the effectiveness of our data-free model extraction and downstream attacks on sequential recommenders in both profile pollution and data poisoning settings.


1. Introduction

Model extraction attacks (Lowd and Meek, 2005; Tramèr et al., 2016) try to make a local copy of a machine learning model given only access to a query API. Model extraction exposes issues such as sensitive training information leakage (Tramèr et al., 2016) and adversarial example attacks (Papernot et al., 2017). Recently, this topic has attracted attention in image classification (Orekondy et al., 2019; Papernot et al., 2017; Zhou et al., 2020; Kariyappa et al., 2020) and text classification (Pal et al., 2019; Krishna et al., 2020). In this work, we show that model extraction attacks also pose a threat to sequential recommender systems.

Sequential models are a popular framework for personalized recommendation, capturing users’ evolving interests and item-to-item transition patterns. In recent years, various neural-network-based models, such as RNN and CNN frameworks (e.g. GRU4Rec (Hidasi et al., 2015), Caser (Tang and Wang, 2018), NARM (Li et al., 2017)) and Transformer frameworks (e.g. SASRec (Kang and McAuley, 2018), BERT4Rec (Sun et al., 2019)), are widely used and consistently outperform non-sequential (Rendle et al., 2012; He et al., 2017) as well as traditional sequential models (Rendle et al., 2010; He and McAuley, 2016).

However, only a few works have studied attacks on recommenders, and they have certain limitations: (1) Few attack methods are tailored to sequential models. Attacks via adversarial machine learning have achieved state-of-the-art results in general recommendation settings (Christakopoulou and Banerjee, 2019; Fang et al., 2020; Tang et al., 2020), but experiments are conducted on matrix-factorization models and are hard to apply directly to sequential recommendation; although some model-agnostic attacks (Lam and Riedl, 2004; Burke et al., 2005) can be used in sequential settings, they depend heavily on heuristics and their effectiveness is often limited. (2) Many attack methods assume that the full training data of the victim model is exposed to attackers (Zhang et al., 2020; Christakopoulou and Banerjee, 2019; Fang et al., 2020; Tang et al., 2020; Li et al., 2016), which can then be used to train surrogate local models. However, this setting is quite restrictive (or unrealistic), especially in implicit feedback settings (e.g. clicks, views), where such data would be very difficult for an attacker to obtain.

We consider a data-free setting, where no original training data is available to train a surrogate model. That is, we build a surrogate model without real training data but limited API queries. We first construct downstream attacks against our surrogate (white-box) sequential recommender, then transfer the attacks to the victim (black-box) recommender.

Figure 1. We illustrate two adversarial attack scenarios against sequential recommenders via model extraction.

Model extraction on sequential recommenders poses several challenges: (1) no access to the original training dataset; (2) unlike image or text tasks, we cannot directly use surrogate datasets with semantic similarities; (3) APIs generally only provide rankings (rather than e.g. probabilities) and the query budget can be limited. Considering these challenges, sequential recommenders may seem relatively safe. However, noting that sequential recommenders are often trained in an autoregressive way (i.e., predicting the next event in a sequence based on previous ones), we show that the recommender itself can be used to generate sequential data that resembles training data from the ‘real’ data distribution. With this property and a sampling strategy: (1) ‘fake’ training data can be constructed that renders sequential recommenders vulnerable to model extraction; (2) the ‘fake’ data from a limited number of API queries can resemble normal user behavior, which is difficult to detect.

Downstream attacks are performed given the extracted surrogate model (see Figure 1). However, attack methods tailored to sequential recommenders are scarce (Zhang et al., 2020). In this work, we propose two attack methods based on adversarial example techniques against current sequential recommenders: profile pollution attacks (which operate by ‘appending’ items to users’ logs) and data poisoning attacks (where ‘fake’ users are generated to bias the retrained model). We extensively evaluate the effectiveness of our strategies in the setting where a black-box sequential model returns top-k ranked lists.

2. Related Work

2.1. Model Extraction in Image and Text Tasks

Model extraction attacks were proposed in (Lowd and Meek, 2005; Tramèr et al., 2016), ‘stealing’ model weights to make a local model copy (Orekondy et al., 2019; Papernot et al., 2017; Zhou et al., 2020; Kariyappa et al., 2020; Pal et al., 2019; Krishna et al., 2020). Prior works are often concerned with image classification. To extract the target model weights, JBDA (Papernot et al., 2017) and KnockoffNets (Orekondy et al., 2019) assume the attackers have access to partial training data or a surrogate dataset with semantic similarities. Recently, methods have been proposed in data-free settings. DaST (Zhou et al., 2020) adopted multi-branch Generative Adversarial Networks (Goodfellow et al., 2014b) to generate synthetic samples, which are subsequently labeled by the target model. MAZE (Kariyappa et al., 2020) generated inputs that maximize the disagreement between the attacker and the target model, and used zeroth-order gradient estimation to optimize the generator module for accurate attacks. Because of the discrete nature of the input space, the methods above cannot transfer to sequential data directly. For Natural Language Processing (NLP) systems, THIEVES (Krishna et al., 2020) studied model extraction attacks on BERT-based APIs (Devlin et al., 2018), where the authors find that random word sequences and a surrogate dataset (e.g. WikiText-103 (Merity et al., 2016)) can both create effective queries and retrieve labels to approximate the target model. In contrast, we found that: (1) in recommendation, it is hard to use surrogate datasets with semantic similarities (as is common in NLP); we also adopt random item sequences as a baseline, but their model extraction performance is limited, so we generate data following the autoregressive property of sequential recommenders; (2) compared with NLP, it is harder to distill ‘ranking’ (instead of classification) in recommendation, and we design a pair-wise ranking loss to tackle this challenge; (3) downstream attacks after model extraction are under-explored, especially in recommendation, so our work also contributes in this regard.

2.2. Attacks on Recommender Systems

Existing works (Yang et al., 2017; Huang et al., 2021) categorize attacks on recommender systems into profile pollution attacks and data poisoning attacks, which affect a recommender system at test time and training time, respectively.

Profile pollution attacks aim to pollute a target user’s profile (such as their view history) to manipulate that specific user’s recommendation results. For example, (Xing et al., 2013) uses a cross-site request forgery (CSRF) technique (Zeller and Felten, 2008) to inject ‘fake’ user views into target user logs on real-world websites including YouTube, Amazon and Google Search. However, the strategy used in (Xing et al., 2013) to decide which ‘fake’ item should be injected is simple and heuristic, without knowledge of the victim recommender. In our work, given that we can extract victim recommender weights, more effective attacks can be investigated, such as evasion attacks in general machine learning (Goodfellow et al., 2014a; Kurakin et al., 2016; Papernot et al., 2017). Note that in our work, we assume we can append items to target user logs with injection attacks via implanted malware (Ren et al., 2015; Lee et al., 2017), where item interactions can be added on users’ behalf; therefore, we focus on the injection algorithm design and attack transferability (exact approaches for malware development and activity injection are cybersecurity tasks and beyond our research scope).

Data poisoning attacks (a.k.a. shilling attacks (Lam and Riedl, 2004; Gunes et al., 2014)) generate ratings from a number of fake users to poison training data. Some poisoning methods are recommender-agnostic (Lam and Riedl, 2004; Burke et al., 2005) (i.e., they do not consider the characteristics of the model architecture), but they depend heavily on heuristics, which often limits their effectiveness. Meanwhile, poisoning attacks have been proposed for specific recommender architectures. For example, (Li et al., 2016; Christakopoulou and Banerjee, 2019; Tang et al., 2020) propose poisoning algorithms for matrix-factorization recommenders and (Yang et al., 2017) poisons co-visitation-based recommenders. Recently, as deep learning has been widely applied to recommendation, (Huang et al., 2021) investigate data poisoning in the neural collaborative filtering (NCF) framework (He et al., 2017). The closest works to ours are perhaps LOKI (Zhang et al., 2020), because of its sequential setting, and (Christakopoulou and Banerjee, 2019; Fang et al., 2020; Tang et al., 2020), which adopt adversarial machine learning techniques to generate ‘fake’ user profiles for matrix-factorization models. One (fairly restrictive) limitation of LOKI is the assumption that the attacker can access the complete interaction data used for training. Another limitation is that LOKI is infeasible against deep networks (e.g. RNNs / transformers), due to an unaffordable (even with tricks) Hessian computation. In our work, black-box recommender weights are extracted for attacks without any real training data; we further design approximate and effective attacks against these recommenders.

3. Framework

Our framework has two stages: (1) Model extraction: we generate informative synthetic data to train our white-box recommender, which can rapidly close the gap between the victim recommender and ours via knowledge distillation (Hinton et al., 2015); (2) Downstream attacks: we propose gradient-based adversarial sample generation algorithms, which allow us to find effective adversarial sequences in the discrete item space from the white-box recommender and achieve successful profile pollution or data poisoning attacks against the victim recommender.

3.1. Setting

To focus on model extraction and attacks against black-box sequential recommender systems, we formalize the problem with the following settings, which define the research scope:

  • Unknown Weights: Weights or metrics of the victim recommender are not provided.

  • Data-Free: Original training data is not available, and item statistics (e.g. popularity) are not accessible.

  • Limited API Queries: Given some input data, the victim model API provides a ranked list of items (e.g. top 100 recommended items). To avoid large numbers of API requests, we define budgets for the total number of victim model queries. Here, we treat each input sequence as one budget unit.

  • (Partially) Known Architecture: Although weights are confidential, model architectures are known (e.g. we know the victim recommender is a transformer-based model). We also relax this assumption to cases where the white-box recommender uses a different sequential model architecture from the victim recommender.

3.2. Threat Model

3.2.1. Black-box Victim Recommender

Formally, we first denote $\mathcal{I}$ as the discrete item space with $|\mathcal{I}|$ elements. Given a sequence of length $t$ ordered by timestamp, i.e., $x = [x_1, x_2, \ldots, x_t]$ where $x_i \in \mathcal{I}$, a victim sequential recommender $f_b$ is a black-box, i.e., its weights are unknown. $f_b$ should return a truncated ranked list over the next possible item in the item space $\mathcal{I}$, i.e., $y = f_b(x) = [y_1, y_2, \ldots, y_k]$, where $y$ is the truncated ranked list of the top-k recommended items. Many platforms surface ranked lists in such a form.

3.2.2. White-box Surrogate Recommender

We construct the white-box model $f_w$ with two components: an embedding layer $E$ and a sequential model $g$, such that $f_w(x) = g(E(x))$. Although the black-box model only returns a list of recommended items, the output scores of the white-box model over the whole item space can now be accessed, i.e., $r = f_w(x) \in \mathbb{R}^{|\mathcal{I}|}$.

3.3. Attack Goal

3.3.1. Model Extraction

As motivated previously, the first step is to extract a white-box model $f_w$ from a trained, black-box victim model $f_b$. Without accessing the weights of $f_b$, we obtain information from the black-box model by making limited queries and saving the predicted ranked list from each query. In other words, we want to minimize the distance between the black-box and white-box model on query results. We are able to achieve the goal of learning a white-box model via knowledge distillation (Hinton et al., 2015). Mathematically, the model extraction process can be formulated as an optimization problem:

$$\min_{f_w} \; \sum_{i} \mathcal{L}\big(f_w(x^{(i)}),\; y^{(i)}\big) \qquad (1)$$

where $X = \{x^{(i)}\}$ represents a set of sequences with unique identifier $i$. We define $Y = \{y^{(i)}\}$ as a set of top-k predicted ranked lists, where $y^{(i)} = f_b(x^{(i)})$ is the black-box model output for $x^{(i)}$. Note that the data $(X, Y)$ are not real training data. Instead, they are generated with specific strategies, whose details are included in Section 4. $\mathcal{L}$ is a loss function measuring the distance between the two model outputs, such as a ranking loss.
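As a concrete (hypothetical) illustration of this objective, the sketch below fits a white-box model to recorded black-box rankings with a generic ranking loss; the interface of `white_box` (item-id sequences in, item scores out) and the two-argument `loss_fn` are assumptions for illustration, and the actual data generation and loss used in this work are described in Section 4.

```python
import torch

def extract_model(white_box, synthetic_pairs, loss_fn, epochs=10, lr=1e-3):
    """Sketch of the extraction objective in Eq. (1): fit the white-box model to
    the black-box top-k lists recorded for the synthetic query sequences.

    `synthetic_pairs` is assumed to be a list of (sequence_tensor, top_k_tensor)
    pairs; `loss_fn(scores, top_k)` is any distance between white-box scores and
    a black-box ranked list (e.g. the ranking loss of Section 4.1.2).
    """
    optimizer = torch.optim.Adam(white_box.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in synthetic_pairs:
            scores = white_box(x.unsqueeze(0)).squeeze(0)  # scores over the item space
            loss = loss_fn(scores, y)                      # distance to black-box ranking
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return white_box
```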

3.3.2. Downstream Attacks

We use the extracted white-box model as the surrogate of the black-box model to construct attacks. In this work, we investigate targeted promotion attacks, whose goal is to increase the target item exposure to users as much as possible, which is a common attack scenario (Tang et al., 2020). Note that targeted demotion attacks can also be constructed with similar techniques. Formally, the objective of targeted promotion attacks is:

  • Profile Pollution Attack: We define profile pollution attacks formally as the problem of finding the optimal injection items $z = [z_1, \ldots, z_m]$ (items appended after the original sequence $x$) that maximize the target item exposure $\mathrm{EXP}(T, \cdot)$, which can be characterized with common ranking measures like Recall or NDCG (Kang and McAuley, 2018; Sun et al., 2019; Li et al., 2017):

    $$\max_{z} \; \mathrm{EXP}\big(T,\; f_b([x; z])\big) \qquad (2)$$

    where $[x; z]$ refers to the concatenation of the sequence $x$ and the attacking items $z$. Note that in the profile pollution setting, no retraining is needed, and this user-specific profile is assumed to be accessible and injectable (e.g. using malware (Ren et al., 2015; Lee et al., 2017); see Section 2.2).

  • Data Poisoning Attack: Similarly, poisoning attacks can be viewed as finding biased injection profiles $X_{adv}$, such that after retraining, the recommender propagates the bias and is more likely to recommend the target. $X \cup X_{adv}$ refers to the injection of fake profiles into the normal training data, and $f_b^{*}$ is the retrained recommender with recommender training loss $\mathcal{L}_{rec}$:

    $$\max_{X_{adv}} \; \mathrm{EXP}\big(T,\; f_b^{*}\big), \quad \text{where} \;\; f_b^{*} = \arg\min_{f} \; \mathcal{L}_{rec}\big(f,\; X \cup X_{adv}\big) \qquad (3)$$

4. Methodology

4.1. Data-Free Model Extraction

To extract a black-box recommender in a data-free setting, we complete the process in two steps: (1) data generation, which generates input sequences $X$ and output ranked lists $Y$; (2) model distillation, which uses $(X, Y)$ to minimize the difference between the black-box and white-box recommenders.

4.1.1. Data Generation.

Considering that we don’t have access to the original training data and item statistics, a trivial solution is to make use of random data and acquire the model recommendations for later stages.

  • Random: Items are uniformly sampled from the item space $\mathcal{I}$ to form sequences $X = \{x^{(i)}\}_{i=1}^{B}$, where $x^{(i)}$ is a generated sequence with identifier $i$ and $B$ is the budget size. Top-k items ($k = 100$ in our experiments) are acquired from the victim recommender at each step to form the output result set, i.e., $y^{(i)}_{j} = f_b(x^{(i)}_{1:j})$, where the operation $x^{(i)}_{1:j}$ truncates the first $j$ items of the sequence (corresponding to a recommendation list after each click), and $j$ ranges up to the length of $x^{(i)}$, which can be sampled from a pre-defined distribution or simply set to a fixed value. Following this strategy, we generate inputs and labels for model distillation.

However, random data cannot simulate real user behavior sequences, where sequential dependencies exist among different steps. To tackle this problem, we propose an autoregressive generation strategy. Inspired by autoregressive language models, where generated sentences are similar to the ‘real’ data distribution, and the finding that sequential recommenders are often trained in an autoregressive way (Hidasi et al., 2015; Li et al., 2017; Kang and McAuley, 2018), we generate fake sequences autoregressively.

  • Autoregressive: To generate one sequence $x^{(i)}$, a random item is sampled as the start item $x^{(i)}_1$ and fed to the sequential recommender to get a recommendation list $f_b(x^{(i)}_{1:1})$. We repeat this step autoregressively, i.e., $x^{(i)}_{j+1} = \mathrm{sampler}\big(f_b(x^{(i)}_{1:j})\big)$, to generate sequences up to the maximum length. Here, sampler is a method to sample one item from the given top-k list. In our experiments, sampling from the top-k items with monotonically decreasing probability performs similarly to uniform sampling, so we favor sampling when selecting the next item from the top-k list. Accordingly, we generate sequences and record the top-k lists, forming a dataset for model distillation.

(a) Autoregressive Data Generation
(b) Model Extraction via Distillation
Figure 2. Autoregressive sequences and the use of synthetic data for model extraction.

For autoregressive sequence generation, Figure 2(a) visually represents the process of accumulating data by repeatedly feeding the current data to the recommender and appending the current sequences with items sampled from the model output. The autoregressive method is beneficial because: (1) the generated data is more diverse and more representative of real user behavior; in particular, sampling instead of always choosing the first recommended item helps build more diversified and ‘user-like’ distillation data; (2) since API queries are limited, the autoregressive method, by resembling the real data distribution, obtains higher-quality data to train a surrogate model effectively; (3) it resembles the behavior of real users and so is hard to detect. Nevertheless, the autoregressive method does not exploit output rankings and item properties like similarity due to restricted data access, which could limit the full utilization of a sequential recommender.
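To make the procedure concrete, below is a minimal sketch of autoregressive data generation. The function `query_black_box` is a hypothetical stand-in for one victim API call that returns a top-k list, and uniform sampling over the returned list is used as the sampler; both are illustrative assumptions rather than the exact released implementation.

```python
import random

def generate_autoregressive_sequences(query_black_box, num_items, budget, max_len):
    """Sketch of the autoregressive generation strategy in Section 4.1.1.

    `query_black_box(seq)` is a placeholder for one call to the victim API that
    returns the top-k recommended item ids for the given prefix; each generated
    sequence is counted as one unit of the query budget.
    """
    dataset = []
    for _ in range(budget):
        seq = [random.randrange(num_items)]   # random start item
        top_k_lists = []                      # black-box top-k list recorded per step
        while len(seq) < max_len:
            top_k = query_black_box(seq)      # ranked list from the victim recommender
            top_k_lists.append(top_k)
            # sample the next item from the returned top-k list rather than always
            # taking rank 1, which yields more diverse, 'user-like' sequences
            seq.append(random.choice(top_k))
        dataset.append((seq, top_k_lists))
    return dataset
```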

4.1.2. Model Distillation.

We use model distillation (Hinton et al., 2015) to minimize the difference between $f_b$ and $f_w$ by training with the generated data (see Figure 2(b)). To get the most out of a single sequence during distillation, we generate sub-sequences and labels to enrich the training data: for an input sequence $x^{(i)}$ from $X$, we split it into the entries $\{(x^{(i)}_{1:j}, y^{(i)}_{j})\}_{j}$, following the training strategies in (Hidasi et al., 2015; Li et al., 2017).

Compared to traditional model distillation (Hinton et al., 2015; Krishna et al., 2020), where a model is distilled from predicted label probabilities, in our setting we only have a top-k ranked list instead of probability distributions. So we propose a method to distill the model with a ranking loss. We can access the white-box model’s output scores for the items in the black-box top-k list $y$, defined as $r_y = [r_{y_1}, r_{y_2}, \ldots, r_{y_k}]$ (for example, for $y = [y_1, y_2]$, $r_y = [r_{y_1}, r_{y_2}]$). We also sample negative items $N$ uniformly and retrieve their scores $r_{y^-}$ for $y^- \in N$. We design a pair-wise ranking loss to measure the distance between the black-box and white-box outputs:

$$\mathcal{L}(r, y) = \sum_{j=1}^{k-1} \max\big(0,\; \lambda_1 - (r_{y_j} - r_{y_{j+1}})\big) \;+\; \sum_{y^- \in N} \max\big(0,\; \lambda_2 - (r_{y_k} - r_{y^-})\big) \qquad (4)$$

The loss function consists of two terms. The first term emphasizes ranking by computing a margin ranking loss between all neighboring item pairs in $y$, which encourages the white-box model to rank the positive items in the same order as the black-box recommender. The second term punishes negative samples when they have higher scores than the top-k items, such that the distilled model learns to ‘recall’ similar top-k groups for next-item recommendation. $\lambda_1$ and $\lambda_2$ are two margin values set empirically as hyperparameters.
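A PyTorch sketch of this pair-wise ranking loss is given below. The tensor shapes, the use of the lowest-ranked top-k item as the reference score for negatives, and the default margin values are assumptions chosen for illustration rather than the exact released implementation.

```python
import torch
import torch.nn.functional as F

def ranking_distillation_loss(scores, top_k_items, negative_items,
                              lambda1=0.5, lambda2=1.0):
    """Sketch of the pair-wise ranking loss of Eq. (4), under assumed shapes.

    scores:          (batch, num_items) white-box output scores over all items
    top_k_items:     (batch, k) item ids of the black-box top-k list, in ranked order
    negative_items:  (batch, n_neg) uniformly sampled negative item ids
    lambda1/lambda2: margin hyperparameters, set empirically
    """
    pos_scores = scores.gather(1, top_k_items)     # (batch, k)
    neg_scores = scores.gather(1, negative_items)  # (batch, n_neg)

    # term 1: neighbouring pairs in the top-k list should keep their relative order
    rank_term = F.relu(lambda1 - (pos_scores[:, :-1] - pos_scores[:, 1:])).mean()

    # term 2: every negative should score below the lowest-ranked top-k item
    lowest_pos = pos_scores[:, -1:]                # (batch, 1)
    neg_term = F.relu(lambda2 - (lowest_pos - neg_scores)).mean()

    return rank_term + neg_term
```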

4.2. Downstream Attacks

To investigate whether attacks can be transferred from the trained white-box model (below we directly use $f_w$ to denote the trained white-box model) to the black-box model $f_b$, we introduce two model attacks against sequential recommender systems: profile pollution attacks and data poisoning attacks; see Figure 1 for an illustration of the two attack scenarios.

(a) Profile pollution attack with white-box model
(b) Data poisoning attack via adversarial co-visitation
Figure 3. Two scenarios: profile pollution attacks and data poisoning attacks.

4.2.1. Profile Pollution Attack

As described in Figure 3(a), we perform profile pollution attacks to promote target item exposure and use Algorithm 1 to construct the manipulated sequence in the input space. Because we can access the gradients in the white-box model, we are able to append ‘adversarial’ items by extending adversarial example techniques (e.g. the Targeted Fast Gradient Sign Method (T-FGSM) (Goodfellow et al., 2014a)) from continuous feature spaces (e.g. image pixel values) to the discrete item space (e.g. item IDs), under the assumption that the optimal item is ‘close’ to the target item in embedding space. Therefore we can achieve user-specific item promotion without the black-box recommender being retrained, as follows.

Step 1: Compute Gradients at the Embedding Level. Given the user history $x$, we construct the corrupted sequence $x'$ by appending adversarial items after $x$. We first initialize $x'$ to be the same as $x$ and append the target item $t$ to it. We then feed the embedded input $E(x')$ to the model and compute backward gradients w.r.t. the input embeddings using a cross-entropy loss, where the target item $t$ is used as the label: $\nabla_{E(x')} \, \mathcal{L}_{CE}\big(f_w(x'),\; t\big)$.

1 Input: sequence $x$, target item $t$, expected length $L$ of the polluted sequence, white-box model $f_w$ (i.e. embedding $E$ and sequential model $g$), step size $\epsilon$ and candidate number $n$;
2 Output: polluted sequence $x'$;
3 initialize $x'$ with $x$;
4 while length of $x' < L$ do
5       extend $x'$ by appending the target item: $x' \leftarrow [x'; t]$;
6       transform the sequence into sequence embeddings: $e \leftarrow E(x')$;
7       compute backward gradients $\nabla_e$ using the cross-entropy loss $\mathcal{L}_{CE}(f_w(x'), t)$;
8       compute cosine similarity scores between the perturbed injection embedding $e_{-1} - \epsilon \cdot \mathrm{sign}(\nabla_{e_{-1}})$ and all item embeddings in $E$;
9       select the $n$ candidates with the highest scores, excluding repeated targets;
10      replace the last item in $x'$ with the candidate that yields the highest score of $t$ under $f_w$;
11 end while
Algorithm 1 Adversarial Item Search for Profile Pollution

Step 2: Search for Adversarial Candidates. Based on the embeddings and gradients from the previous step, we first perform T-FGSM to compute the perturbed embeddings; then the cosine similarity between the embedded injection item in $x'$ and all item embeddings in $E$ is computed. We select the $n$ adversarial candidates with the highest cosine similarity. These candidates are tested with $f_w$, and the one leading to the highest ranking score of the target $t$ is kept as the adversarial item. This process can be repeated to inject multiple items for better attack performance; to avoid a disproportionate number of target items in the injection, we require that target items do not appear consecutively.
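The search in Algorithm 1 can be sketched in PyTorch as follows. This is a minimal, illustrative version under several assumptions: `model` accepts already-embedded sequences of shape (1, length, dim) and returns scores over the item space, `item_emb` is the white-box embedding table of shape (num_items, dim), and batching, device placement and the no-consecutive-target constraint are simplified.

```python
import torch
import torch.nn.functional as F

def find_adversarial_item(model, item_emb, seq, target, epsilon=1.0, n_candidates=10):
    """Sketch of one adversarial-item search step (Algorithm 1)."""
    device = item_emb.device
    # append the target as a placeholder injection item and embed the sequence
    seq_ids = torch.tensor(seq + [target], device=device)
    emb = item_emb[seq_ids].detach().clone().requires_grad_(True)

    # targeted cross-entropy loss on the next-item scores
    logits = model(emb.unsqueeze(0)).squeeze(0)       # (num_items,)
    loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([target], device=device))
    loss.backward()

    # T-FGSM: move the injection-position embedding against the gradient
    perturbed = emb[-1] - epsilon * emb.grad[-1].sign()

    # rank all real items by cosine similarity to the perturbed embedding
    sims = F.cosine_similarity(perturbed.unsqueeze(0), item_emb, dim=-1)
    candidates = sims.topk(n_candidates).indices.tolist()

    # keep the candidate that gives the target the highest white-box score
    best, best_score = None, float("-inf")
    for c in candidates:
        if c == target:                               # avoid trivially repeating the target
            continue
        with torch.no_grad():
            cand_ids = torch.tensor(seq + [c], device=device)
            cand_logits = model(item_emb[cand_ids].unsqueeze(0)).squeeze(0)
        if cand_logits[target] > best_score:
            best, best_score = c, cand_logits[target].item()
    return best
```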

4.2.2. Data Poisoning

1 Input: target item $t$, expected length $L$, white-box model $f_w$ (i.e. embedding $E$ and sequential model $g$), step size $\epsilon$ and candidate number $n$;
2 Output: adversarial profile $x'$;
3 initialize $x'$ with $[t]$;
4 while length of $x' < L$ do
5       sample an item $z$ from $\mathcal{I}$ and append it to $x'$: $x' \leftarrow [x'; z]$;
6       transform the sequence into sequence embeddings: $e \leftarrow E(x')$;
7       compute backward gradients $\nabla_e$ using the cross-entropy loss $\mathcal{L}_{CE}(f_w(x'), t)$;
8       compute cosine similarity scores between the perturbed embedding $e_{-1} - \epsilon \cdot \mathrm{sign}(\nabla_{e_{-1}})$ and all item embeddings in $E$;
9       select the $n$ candidates with the lowest scores, excluding repeated items;
10      sample an item from the candidates and replace $z$ in $x'$;
11      append the target item: $x' \leftarrow [x'; t]$;
12 end while
Algorithm 2 Adversarial Profile Generation for Data Poisoning

Data poisoning attacks operate via fake profile injection to promote target item exposure as much as possible (after retraining on the fake and normal profiles). We propose a simple adversarial strategy (visualized in Figure 3(b)) to generate poisoning data with the white-box model $f_w$. The intuition behind our poisoning data generation is that the next item should be the target even given sequences of seemingly irrelevant items.

In this case, we follow the co-visitation approach (Yang et al., 2017; Song et al., 2020) and apply adversarial example techniques (Goodfellow et al., 2014a) to generate poisoning data. (1) We consider one poisoning sequence using alternating item pairs (e.g. [target, $z_1$, target, $z_2$, target, ...]); (2) we try to find irrelevant/unlikely items to fill in the $z_j$ positions. In detail, we use a similar approach to Algorithm 1, where the generation process first computes backward gradients with the cross-entropy loss and T-FGSM; however, when narrowing the candidate items we choose those with the lowest similarity scores (instead of the highest); (3) we repeat (1) and (2) to generate the poisoning data; details can be found in Algorithm 2.

Note that although the alternating pattern of co-visitation seems detectable, we can control the proportion of target items (applying noise or adding ‘user-like’ generated data) to avoid this rigid pattern and make it less noticeable. Gradient information from the white-box model also enables more tailored poisoning methods for sequential recommendation.
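For comparison, a correspondingly simplified sketch of the adversarial profile generation in Algorithm 2 is shown below, under the same assumptions about `model` and `item_emb` as the previous sketch (embedded sequences in, item scores out; device handling omitted). The substantive differences from the pollution sketch are that fillers are drawn from the lowest-similarity candidates and the target is re-appended after every filler.

```python
import random
import torch
import torch.nn.functional as F

def generate_poisoning_profile(model, item_emb, target, length, epsilon=1.0, n_candidates=10):
    """Sketch of adversarial profile generation (Algorithm 2): build an alternating
    [target, filler, target, ...] profile, choosing fillers among the items *least*
    similar to the T-FGSM-perturbed embedding."""
    profile = [target]
    while len(profile) < length:
        # start from a random placeholder filler appended after the current profile
        seq_ids = torch.tensor(profile + [random.randrange(item_emb.size(0))])
        emb = item_emb[seq_ids].detach().clone().requires_grad_(True)
        logits = model(emb.unsqueeze(0)).squeeze(0)
        F.cross_entropy(logits.unsqueeze(0), torch.tensor([target])).backward()
        perturbed = emb[-1] - epsilon * emb.grad[-1].sign()
        sims = F.cosine_similarity(perturbed.unsqueeze(0), item_emb, dim=-1)
        # pick the filler from the lowest-similarity candidates, then re-append the target
        filler = random.choice(sims.topk(n_candidates, largest=False).indices.tolist())
        profile += [filler, target]
    return profile[:length]
```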

5. Experiments

5.1. Setup

5.1.1. Dataset

We use three popular recommendation datasets (see Table 1) to evaluate our methods: MovieLens-1M (ML-1M) (Harper and Konstan, 2015), Steam (McAuley et al., 2015) and Amazon Beauty (Ni et al., 2019). We follow the preprocessing in BERT4Rec (Sun et al., 2019) to convert the rating data into implicit feedback. We follow SASRec (Kang and McAuley, 2018) and BERT4Rec (Sun et al., 2019) to hold out the last two items in each sequence for validation and testing, and use the rest for training. We set the black-box API to return the top-100 recommended items.
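For illustration, a minimal sketch of this leave-two-out split is shown below; it assumes sequences are stored as chronologically ordered per-user item lists, and the handling of very short sequences is a simplifying assumption rather than the exact preprocessing used here.

```python
def leave_two_out_split(user_sequences):
    """Per user: last item for testing, second-to-last for validation, rest for training.
    `user_sequences` maps a user id to a chronologically ordered list of item ids."""
    train, val, test = {}, {}, {}
    for user, seq in user_sequences.items():
        if len(seq) < 3:                 # too short to split; keep everything for training
            train[user] = seq
            continue
        train[user], val[user], test[user] = seq[:-2], seq[-2], seq[-1]
    return train, val, test
```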

Datasets Users Items Avg. len Max. len Sparsity
ML-1M 6,040 3,416 166 2277 95.16%
Steam 334,542 13,046 13 2045 99.90%
Beauty 40,226 54,542 9 293 99.98%
Table 1. Data Statistics
Model Basic Block Training Schema
NARM (Li et al., 2017) GRU Autoregressive
SASRec (Kang and McAuley, 2018) TRM Autoregressive
BERT4Rec (Sun et al., 2019) TRM Auto-encoding
Table 2. Sequential Model Architecture

5.1.2. Model.

To evaluate the performance of our attack, we implement the model extraction attack on three representative sequential recommenders, including our PyTorch implementations of NARM (Li et al., 2017), BERT4Rec (Sun et al., 2019) and SASRec (Kang and McAuley, 2018), with different basic blocks and training schemata, as shown in Table 2.

  • NARM is an attentive recommender, containing an embedding layer, a gated recurrent unit (GRU) (Cho et al., 2014) as a global and local encoder, an attention module to compute session features, and a similarity layer, which outputs the items most similar to the session features as recommendations (Li et al., 2017).

  • SASRec consists of an embedding layer that includes both the item embedding and the positional embedding of an input sequence as well as a stack of one-directional transformer (TRM) layers, where each transformer layer contains a multi-head self-attention module and a position-wise feed-forward network (Kang and McAuley, 2018).

  • BERT4Rec has an architecture similar to SASRec, but using a bidirectional transformer and an auto-encoding (masked language modeling) task for training (Sun et al., 2019).

5.1.3. Implementation Details.

Given a user sequence with length $t$, we follow (Kang and McAuley, 2018; Sun et al., 2019) to use the first $t-2$ items as training data and the last two items for validation and testing respectively. We use hyper-parameters from grid search and suggestions from the original papers (Li et al., 2017; Kang and McAuley, 2018; Jin et al., 2019). For reproducibility, we summarize important training configurations in Table 3. Additionally, all models are trained using the Adam (Kingma and Ba, 2014) optimizer with weight decay 0.01, learning rate 0.001, batch size 128 and 100 linear warmup steps. We follow (Kang and McAuley, 2018; Sun et al., 2019) to set the maximum allowed sequence lengths of ML-1M, Steam and Beauty, which are also used as our generated sequence lengths. We follow (Tang et al., 2020) to set the number of poisoning profiles to 1% of the number of real profiles. Code and data are released at https://github.com/Yueeeeeeee/RecSys-Extraction-Attack.

Phase Model Config. on {ML-1M, Steam, Beauty} Phase Config. on {ML-1M, Steam, Beauty}
Black-box Training N: GRU ly=1; dr={0.1, 0.2, 0.5} Model Extraction (λ1, λ2): {(0.75, 1.5), (0.5, 1.0), (0.5, 0.5)}
S: TRM ly=2; h=2; dr={0.1, 0.2, 0.5} Profile Pollution Appended items: {10, 2, 2}, ε = 1.0, n = 10
B: TRM ly=2; h=2; dr={0.1, 0.2, 0.5}; mp={0.2, 0.2, 0.6} Data Poisoning Injected users: {60, 3345, 402}, ε = 1.0, n = 10
Table 3. Configurations. N: NARM, S: SASRec, B: BERT4Rec, ly: layers, h: attention heads, dr: dropout rate, mp: masking probability.

5.1.4. Evaluation Protocol

We follow SASRec (Kang and McAuley, 2018) to accelerate evaluation by uniformly sampling 100 negative items for each user. We then rank them together with the positive item and report the average performance on these 101 test items. Our evaluation focuses on two aspects:

  • Ranking Performance: We use truncated Recall@K (equivalent to Hit Rate, HR@K, in our evaluation) and Normalized Discounted Cumulative Gain (NDCG@K) to measure ranking quality, following SASRec (Kang and McAuley, 2018) and BERT4Rec (Sun et al., 2019); higher is better.

  • Agreement Measure: We define Agreement@K (Agr@K) to evaluate the output similarity between the black-box model and our extracted white-box model:

    $$\mathrm{Agr@K} = \frac{\big|\, \mathrm{top\text{-}K}(f_b(x)) \cap \mathrm{top\text{-}K}(f_w(x)) \,\big|}{K} \qquad (5)$$

    where $\mathrm{top\text{-}K}(f_b(x))$ is the top-K predicted list from the black-box model and $\mathrm{top\text{-}K}(f_w(x))$ is that from our white-box model. We report the average Agr@K with $K \in \{1, 10\}$ to measure the output similarity.
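A minimal sketch of this metric:

```python
def agreement_at_k(black_box_top, white_box_top, k):
    """Agr@K as in Eq. (5): overlap of the two models' top-K lists divided by K."""
    return len(set(black_box_top[:k]) & set(white_box_top[:k])) / k

# e.g. agreement_at_k([3, 7, 1, 9], [7, 3, 2, 5], k=4) == 0.5
```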

Black-Box White-Box-Random White-Box-Autoregressive
Dataset Option N@10 R@10 N@10 R@10 Agr@1 Agr@10 N@10 R@10 Agr@1 Agr@10
ML-1M NARM 0.625 0.820 0.598 0.809 0.389 0.605 0.615 0.812 0.571 0.747
SASRec 0.625 0.817 0.578 0.796 0.270 0.516 0.602 0.802 0.454 0.662
BERT4Rec 0.602 0.806 0.565 0.794 0.241 0.488 0.571 0.791 0.339 0.593
Steam NARM 0.628 0.848 0.625 0.849 0.679 0.642 0.601 0.806 0.743 0.722
SASRec 0.627 0.850 0.579 0.802 0.434 0.556 0.593 0.805 0.668 0.702
BERT4Rec 0.622 0.846 0.609 0.838 0.199 0.490 0.585 0.793 0.708 0.667
Beauty NARM 0.356 0.518 0.319 0.477 0.356 0.511 0.272 0.380 0.344 0.425
SASRec 0.344 0.494 0.304 0.459 0.251 0.213 0.347 0.505 0.343 0.357
BERT4Rec 0.349 0.515 0.200 0.352 0.026 0.043 0.300 0.454 0.178 0.291
Table 4. Extraction performance under identical model architecture and 5k budget, with Black-box original performance.
(a) Data Distribution for original and generated data.
(b) Cross-model extraction results. Horizontal / vertical axes represent white-box / black-box model architectures. Darker colors represent larger values of Agr@10.
Figure 4. Data distributions and cross model extraction results.

5.2. RQ1: Can We Extract Model Weights without Real Data?

5.2.1. Standard Model Extraction.

We evaluate two ways (random and autoregressive, as described in Section 4.1.1) of generating synthetic data with labels for model extraction. In standard model extraction, we assume the model architecture is known, so that the white-box model uses the same architecture as the black-box model (e.g. SASRec → SASRec) without real training data. We report results with a fixed budget of 5k sequences in Table 4.

Observations. From Table 4 we have a few observations: (1) Interestingly, without a training set, random and autoregressive can achieve ranking performance (N@10 and R@10) similar to the black-box. For example, compared with black-box NARM on ML-1M, R@10 for random drops 1.34% and for autoregressive only 0.98%. On average, the extracted recommenders’ R@10 is about 94.54% of the original. In particular, random is trained on random data, but the labels are retrieved from the black-box model and reflect correct last-click relations; last clicks help random rank well, but its Agr@K is much poorer than autoregressive (see Table 4).

(2) Autoregressive has significant advantages in narrowing the distance between the two recommenders on all datasets, with a consistently higher average Agr@10 than models distilled from randomly generated data. (3) Figure 4(a) shows that autoregressive resembles the true training data distribution much better than random, because autoregressive generates data by following interactions with recommended items. Although sampling from the popularity distribution could also resemble the original data distribution, it breaks the assumption that we have no knowledge about the training set, and cannot capture similarities arising from sequential dependencies. (4) We also note that the datasets differ greatly in the distillation process. For example, relatively dense datasets with many user interactions, like ML-1M and Steam, increase the probability of correct recommendations. Extracted recommenders based on such data distributions sustain a high level of similarity with respect to the black-box output distribution, while sparser data can lead to problems like higher divergence and worse recommendation agreement, which we examine in the next subsection. (5) Moreover, Table 4 indicates that NARM has the best overall capability of extracting black-box models, as NARM is able to recover most of the black-box models and both its similarity and recommendation metrics are among the highest. As for SASRec and BERT4Rec, both architectures show satisfactory extraction results on the ML-1M and Steam datasets, with SASRec showing slight improvements over BERT4Rec in most cases.

5.2.2. Cross Model Extraction.

Based on the analysis of different architectures, a natural question follows: which model would perform the best on a different black-box architecture? In this setting, we adopt the same budget and conduct cross-extraction experiments to find out how a white-box recommender would differ in terms of similarity when distilling a different black-box architecture. Cross-architecture model extraction is evaluated on these three datasets.

Observations. Results are visualized as heatmaps in Figure 4(b), where the horizontal / vertical axes represent white-box / black-box architectures. The NARM model performs the best overall as a white-box architecture, successfully reproducing most target recommender systems with the highest average agreement in the top-10 recommendations, ahead of SASRec and BERT4Rec.

5.3. RQ2: How do Dataset Sparsity and Budget Influence Model Extraction?

Table 5. Influence of data sparsity. Model extraction on k-core Beauty datasets.
Table 6. Influence of query budgets on ML-1M (top), Steam (middle) and Beauty (bottom).

Data Sparsity. In the previous experiments, we noticed that the sparsity of the original dataset on which the black-box is trained might influence the quality of our distilled model, as suggested by the results on the Beauty dataset in Table 4. To study the sparsity problem, we base a series of experiments on the most sparse dataset (Beauty); all three models are used to study whether the deterioration in model extraction performance is related to the dataset. We choose slightly different preprocessing techniques to build k-core Beauty datasets (i.e. densify interactions until item / user frequencies are both at least k). The processed k-core datasets become denser with increasing k. Our item candidate size (100 negatives) in evaluation does not change. Black-box models are trained on such processed data, followed by autoregressive extraction experiments with a fixed sequence budget, reported in Table 5. As sparsity drops, compared to the 5-core Beauty data, both the black-box and extracted models perform better, and the increasing Agr@10 indicates that the extracted models become more ‘similar’ to the black-box model. Our results indicate that data sparsity is an important factor for model extraction performance, where training on denser datasets usually leads to stronger extracted models.
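For reference, a minimal sketch of k-core filtering over (user, item) interaction pairs (not the exact preprocessing script used here) is:

```python
from collections import Counter

def k_core_filter(interactions, k):
    """Repeatedly drop (user, item) pairs whose user or item has fewer than k
    interactions, until both frequencies are at least k everywhere."""
    while True:
        user_cnt = Counter(u for u, _ in interactions)
        item_cnt = Counter(i for _, i in interactions)
        kept = [(u, i) for u, i in interactions
                if user_cnt[u] >= k and item_cnt[i] >= k]
        if len(kept) == len(interactions):   # fixed point: the k-core is reached
            return kept
        interactions = kept
```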

Budget. We also want to find out how important the query budget is for distillation performance. In our framework we assume that the attacker can only query the black-box model a limited number of times (corresponding to a limited budget). Intuitively, a larger budget would induce a larger generated dataset for distillation and lead to a better distilled model with higher similarity and stronger recommendation scores. However, it is preferable to find an optimal budget such that the white-box system is ‘close enough’ to generate adversarial examples and form threats. Our experiments in Table 6 suggest that an increasing budget would lead to rapid initial improvements, resulting in a highly similar white-box model. Beyond a certain point, the gains are marginal.

5.4. RQ3: Can We Perform Profile Pollution Attacks using the Extracted Model?

Figure 5. Profile pollution attack performance comparisons with different methods.

Setup. In profile pollution, we inject adversarial items after the original user history and test the corrupted data on the black-box recommender. The average lengths of ML-1M, Steam and Beauty are 166, 13 and 9 respectively; based on this, we generate 10 adversarial items for ML-1M and 2 items each for Steam and Beauty (see Table 3). We perform profile pollution attacks on all users and present the average metrics in Figure 5. We avoid repeating targets in the injected items to rule out trivial solutions such as sequences consisting solely of target items. Based on this setting, we introduce the following baseline methods: (1) RandAlter: alternating interactions with random items and the target item(s) (Song et al., 2020); (2) Deep Q Learning (DQN): a naive Q-learning model with an RNN-based architecture, where the rank and number of target item(s) in the top-k recommendations are used as training rewards (Zhao et al., 2017; Zhang et al., 2020); (3) WhiteBox SimAlter: alternating interactions with similar items and the target item(s), where similar items are computed based on the similarity of the white-box item embeddings. (4) In addition, we experiment with the black-box recommender itself as the surrogate model and perform our attacks (BlackBox-Ours).

General Attack Performance Comparisons. In Figure 5, we present the profile pollution performance compared with baselines on three different datasets. (1) Comparing our results with BlackBox-Ours, black-box models generally show vulnerability to pollution attacks from the extracted white-box model. We notice that the attacking performance of distilled models is comparable on ML-1M, but metrics worsen as datasets become sparser and recommenders become harder to extract; for example, the average metrics of NARM on ML-1M reach 94.8% of the black-box model’s attack performance, compared to 66.2% on Steam and 55.6% on Beauty. (2) On all datasets, our method achieves the best targeted item promotion results. For example, on Steam, the N@10 scores of the targeted items are significantly improved from 0.070 to 0.381. This shows the benefit of exploiting the surrogate model from model extraction, and our attacking method is designed as an adversarial example method (Goodfellow et al., 2014a), which surpasses heuristic methods that lack knowledge of the victim models. (3) Empirically, we find that robustness varies across victim recommenders. For example, results on the Steam and Beauty datasets show that SASRec is the most vulnerable model under attack in our experiments: N@10 of SASRec increases from 0.071 to 0.571 on average on the Steam and Beauty datasets, whereas N@10 of NARM increases from 0.068 to 0.286.

ML-1M Steam Beauty
Popularity Attack NARM SASRec BERT4Rec NARM SASRec BERT4Rec NARM SASRec BERT4Rec
head before 0.202 0.217 0.201 0.313 0.327 0.311 0.261 0.246 0.260
ours 0.987 0.981 0.968 0.850 0.745 0.714 0.650 0.825 0.521
middle before 0.037 0.034 0.036 0.012 0.012 0.009 0.023 0.030 0.027
ours 0.902 0.876 0.901 0.341 0.585 0.152 0.106 0.556 0.105
tail before 0.005 0.008 0.009 0.000 0.000 0.000 0.002 0.009 0.002
ours 0.701 0.760 0.760 0.017 0.160 0.000 0.001 0.557 0.010
Table 7. Profile pollution attacks on different sequential models for items with different popularity, reported in N@10.

Items w/ Different Popularities. Table 7 shows our profile pollution attacks on items with different popularities. We group target items using the following rules (Anderson, 2006): head denotes the top 20%, tail the bottom 20% and middle the rest, according to item appearance frequency (popularity). From Table 7, our attack method is effective for items with different popularities. However, in all scenarios, the ranking results after the attack decrease as the popularity of the target item declines; popular items are generally more vulnerable under targeted attacks and could easily be manipulated for gaining attention, whereas unpopular items are often harder to attack.

5.5. RQ4: Can We Perform Data Poisoning Attacks using the Extracted Model?

Figure 6. Data poisoning attack performance comparisons with different methods.

Setup. Different from profile pollution, we do not select a single item as the target; instead, we use target groups as the attack objective to avoid a large number of similar injection profiles and to accelerate retraining. Target selection is identical to profile pollution, and we randomly select an attack target from the 25 target items at each step during the generation of adversarial profiles. Then, the black-box recommender is retrained once and tested on each of the target items; we present the average results in Figure 6. We follow (Tang et al., 2020) to generate profiles equivalent to 1% of the number of users in the original dataset and adopt the same baseline methods as in profile pollution.

General Attack Performance Comparisons. (1) Our methods surpass the other baselines, but the overall promotion is not as effective as profile pollution. This is because we adopt multiple targets for simultaneous poisoning, while profile pollution attacks a specific user using their profile information, so the attacking examples can be stronger. (2) Compared to RandAlter, the proposed adversarial profile generation further enhances this advantage by connecting unlikely popular items with targets and magnifying the bias toward more exposure of our target items. For example, on Beauty, the average N@10 is 0.066 before the attack versus 0.240 with our method across the three models. Moreover, we notice that DQN performs worse than in profile pollution and can occasionally lead to performance deterioration. Potential reasons for the performance drop include: the less frequent appearance of target items compared to the co-visitation approach; and the fact that the updated model parameters are independent of the generated fake profiles, as in (Tang et al., 2020). (3) Comparable results between BlackBox-Ours and WhiteBox-Ours suggest that the bottleneck for data poisoning lies in the generation algorithm rather than in white-box similarity.

ML-1M Steam Beauty
Popularity Attack NARM SASRec BERT4Rec NARM SASRec BERT4Rec NARM SASRec BERT4Rec
head before 0.202 0.217 0.201 0.313 0.327 0.311 0.261 0.246 0.260
ours 0.205 0.351 0.213 0.347 0.352 0.324 0.425 0.477 0.364
medium before 0.037 0.034 0.036 0.012 0.012 0.009 0.023 0.030 0.027
ours 0.053 0.067 0.048 0.026 0.021 0.014 0.217 0.309 0.135
tail before 0.005 0.008 0.009 0.000 0.000 0.000 0.002 0.009 0.002
ours 0.016 0.032 0.017 0.004 0.004 0.004 0.086 0.235 0.036
Table 8. Data poisoning attacks on different sequential models for items with different popularity, reported in N@10.

Items w/ Different Popularities. Table 8 reveals the poisoning effectiveness as a function of target popularity. In contrast to the numbers in Table 7, the relative improvements are more significant for middle and tail items. For example, the relative improvement for head items is 30.8%, compared to over 300% for middle items and over 1000% for tail items in N@10. The results suggest that data poisoning is more helpful in elevating exposure for less popular items, while promoting popular items via profile injection is harder in this case.

6. Conclusion And Future Work

In this work, we systematically explore the feasibility and efficacy of stealing and attacking sequential recommender systems with different state-of-the-art architectures. First, our experimental results show that black-box models can be threatened by model extraction attacks. That is, we are able to learn a white-box model which behaves similarly to the black-box model (for instance, with 0.747 top-10 agreement on the ML-1M dataset) without access to training data. This suggests that attacks on the white-box model can be transferred to the black-box model. To verify this intuition, we conduct further experiments to study the vulnerability of black-box models under profile pollution and data poisoning attacks. Our experiments show that the performance of a well-trained black-box model can be drastically biased and corrupted under both attacks.

For future work, first, we can extend our framework to more universal settings. In particular, how can we perform model extraction attacks for matrix factorization models or graph-based models? Second, it would be interesting to develop defense algorithms or other novel robust training pipelines, so that sequential recommender systems could be more robust against adversarial and data poisoning attacks. Third, active learning can be applied to find more effective sampling strategies for model extraction with fewer queries.

References

  • C. Anderson (2006) The long tail: why the future of business is selling less of more. Hachette Books. Cited by: §5.4.
  • R. Burke, B. Mobasher, and R. Bhaumik (2005) Limited knowledge shilling attacks in collaborative filtering systems. In Proceedings of the 3rd international workshop on intelligent techniques for web personalization (ITWP 2005), 19th international joint conference on artificial intelligence (IJCAI 2005), pp. 17–24. Cited by: §1, §2.2.
  • K. Cho, B. van Merrienboer, Ç. Gülçehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. In EMNLP, Cited by: 1st item.
  • K. Christakopoulou and A. Banerjee (2019) Adversarial attacks on an oblivious recommender. In Proceedings of the 13th ACM Conference on Recommender Systems, pp. 322–330. Cited by: §1, §2.2.
  • J. Devlin, M. Chang, K. Lee, and K. N. Toutanova (2018) BERT: pre-training of deep bidirectional transformers for language understanding. External Links: Link Cited by: §2.1.
  • M. Fang, N. Z. Gong, and J. Liu (2020) Influence function based data poisoning attacks to top-n recommender systems. In Proceedings of The Web Conference 2020, pp. 3019–3025. Cited by: §1, §2.2.
  • I. J. Goodfellow, J. Shlens, and C. Szegedy (2014a) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572. Cited by: §2.2, §4.2.1, §4.2.2, §5.4.
  • I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio (2014b) Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680. Cited by: §2.1.
  • I. Gunes, C. Kaleli, A. Bilge, and H. Polat (2014) Shilling attacks against recommender systems: a comprehensive survey. Artificial Intelligence Review 42 (4), pp. 767–799. Cited by: §2.2.
  • F. M. Harper and J. A. Konstan (2015) The movielens datasets: history and context. Acm transactions on interactive intelligent systems (tiis) 5 (4), pp. 1–19. Cited by: §5.1.1.
  • R. He and J. McAuley (2016) Fusing similarity models with markov chains for sparse sequential recommendation. In 2016 IEEE 16th International Conference on Data Mining (ICDM), pp. 191–200. Cited by: §1.
  • X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T. Chua (2017) Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web, pp. 173–182. Cited by: §1, §2.2.
  • B. Hidasi, A. Karatzoglou, L. Baltrunas, and D. Tikk (2015) Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939. Cited by: §1, §4.1.1, §4.1.2.
  • G. Hinton, O. Vinyals, and J. Dean (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531. Cited by: §3.3.1, §3, §4.1.2, §4.1.2.
  • H. Huang, J. Mu, N. Z. Gong, Q. Li, B. Liu, and M. Xu (2021) Data poisoning attacks to deep learning based recommender systems. arXiv preprint arXiv:2101.02644. Cited by: §2.2, §2.2.
  • D. Jin, Z. Jin, J. T. Zhou, and P. Szolovits (2019) Is BERT really robust? A strong baseline for natural language attack on text classification and entailment. Cited by: §5.1.3.
  • W. Kang and J. McAuley (2018) Self-attentive sequential recommendation. In 2018 IEEE International Conference on Data Mining (ICDM), pp. 197–206. Cited by: §1, 1st item, §4.1.1, 2nd item, 1st item, §5.1.1, §5.1.2, §5.1.3, §5.1.4, Table 2.
  • S. Kariyappa, A. Prakash, and M. Qureshi (2020) MAZE: data-free model stealing attack using zeroth-order gradient estimation. arXiv preprint arXiv:2005.03161. Cited by: §1, §2.1.
  • D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §5.1.3.
  • K. Krishna, G. S. Tomar, A. P. Parikh, N. Papernot, and M. Iyyer (2020) Thieves on sesame street! model extraction of bert-based apis. In International Conference on Learning Representations, External Links: Link Cited by: §1, §2.1, §4.1.2.
  • A. Kurakin, I. Goodfellow, and S. Bengio (2016) Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533. Cited by: §2.2.
  • S. K. Lam and J. Riedl (2004) Shilling recommender systems for fun and profit. In Proceedings of the 13th international conference on World Wide Web, pp. 393–402. Cited by: §1, §2.2.
  • S. Lee, S. Hwang, and S. Ryu (2017) All about activity injection: threats, semantics, and detection. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 252–262. Cited by: §2.2, 1st item.
  • B. Li, Y. Wang, A. Singh, and Y. Vorobeychik (2016) Data poisoning attacks on factorization-based collaborative filtering. arXiv preprint arXiv:1608.08182. Cited by: §1, §2.2.
  • J. Li, P. Ren, Z. Chen, Z. Ren, T. Lian, and J. Ma (2017) Neural attentive session-based recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1419–1428. Cited by: §1, 1st item, §4.1.1, §4.1.2, 1st item, §5.1.2, §5.1.3, Table 2.
  • D. Lowd and C. Meek (2005) Adversarial learning. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 641–647. Cited by: §1, §2.1.
  • J. McAuley, C. Targett, Q. Shi, and A. Van Den Hengel (2015) Image-based recommendations on styles and substitutes. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval, pp. 43–52. Cited by: §5.1.1.
  • S. Merity, C. Xiong, J. Bradbury, and R. Socher (2016) Pointer sentinel mixture models. arXiv preprint arXiv:1609.07843. Cited by: §2.1.
  • J. Ni, J. Li, and J. McAuley (2019) Justifying recommendations using distantly-labeled reviews and fine-grained aspects. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 188–197. External Links: Link, Document Cited by: §5.1.1.
  • T. Orekondy, B. Schiele, and M. Fritz (2019) Knockoff nets: stealing functionality of black-box models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4954–4963. Cited by: §1, §2.1.
  • S. Pal, Y. Gupta, A. Shukla, A. Kanade, S. Shevade, and V. Ganapathy (2019) A framework for the extraction of deep neural networks by leveraging public data. arXiv preprint arXiv:1905.09165. Cited by: §1, §2.1.
  • N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami (2017) Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security, pp. 506–519. Cited by: §1, §2.1, §2.2.
  • C. Ren, Y. Zhang, H. Xue, T. Wei, and P. Liu (2015) Towards discovering and understanding task hijacking in android. In 24th USENIX Security Symposium (USENIX Security 15), pp. 945–959. Cited by: §2.2, 1st item.
  • S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme (2012) BPR: bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618. Cited by: §1.
  • S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme (2010) Factorizing personalized markov chains for next-basket recommendation. In Proceedings of the 19th international conference on World wide web, pp. 811–820. Cited by: §1.
  • J. Song, Z. Li, Z. Hu, Y. Wu, Z. Li, J. Li, and J. Gao (2020) Poisonrec: an adaptive data poisoning framework for attacking black-box recommender systems. In 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 157–168. Cited by: §4.2.2, §5.4.
  • F. Sun, J. Liu, J. Wu, C. Pei, X. Lin, W. Ou, and P. Jiang (2019) BERT4Rec: sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp. 1441–1450. Cited by: §1, 1st item, 3rd item, 1st item, §5.1.1, §5.1.2, §5.1.3, Table 2.
  • J. Tang and K. Wang (2018) Personalized top-n sequential recommendation via convolutional sequence embedding. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 565–573. Cited by: §1.
  • J. Tang, H. Wen, and K. Wang (2020) Revisiting adversarially learned injection attacks against recommender systems. In Fourteenth ACM Conference on Recommender Systems, pp. 318–327. Cited by: §1, §2.2, §3.3.2, §5.1.3, §5.5, §5.5.
  • F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart (2016) Stealing machine learning models via prediction apis. In 25th USENIX Security Symposium (USENIX Security 16), pp. 601–618. Cited by: §1, §2.1.
  • X. Xing, W. Meng, D. Doozan, A. C. Snoeren, N. Feamster, and W. Lee (2013) Take this personally: pollution attacks on personalized services. In 22nd USENIX Security Symposium (USENIX Security 13), pp. 671–686. Cited by: §2.2.
  • G. Yang, N. Z. Gong, and Y. Cai (2017) Fake co-visitation injection attacks to recommender systems.. In NDSS, Cited by: §2.2, §2.2, §4.2.2.
  • W. Zeller and E. W. Felten (2008) Cross-site request forgeries: exploitation and prevention. Bericht, Princeton University. Cited by: §2.2.
  • H. Zhang, Y. Li, B. Ding, and J. Gao (2020) Practical data poisoning attack against next-item recommendation. In Proceedings of The Web Conference 2020, pp. 2458–2464. Cited by: §1, §1, §2.2, §5.4.
  • X. Zhao, L. Zhang, L. Xia, Z. Ding, D. Yin, and J. Tang (2017) Deep reinforcement learning for list-wise recommendations. arXiv preprint arXiv:1801.00209. Cited by: §5.4.
  • M. Zhou, J. Wu, Y. Liu, S. Liu, and C. Zhu (2020) DaST: data-free substitute training for adversarial attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 234–243. Cited by: §1, §2.1.