Fighting Fire with Fire: Using Antidote Data to Improve Polarization and Fairness of Recommender Systems

The increasing role of recommender systems in many aspects of society makes it essential to consider how such systems may impact social good. Various modifications to recommendation algorithms have been proposed to improve their performance for specific socially relevant measures. However, previous proposals are often not easily adapted to different measures, and they generally require the ability to modify either existing system inputs, the system's algorithm, or the system's outputs. As an alternative, in this paper we introduce the idea of improving the social desirability of recommender system outputs by adding more data to the input, an approach we view as providing 'antidote' data to the system. We formalize the antidote data problem, and develop optimization-based solutions. We take as our model system the matrix factorization approach to recommendation, and we propose a set of measures to capture the polarization or fairness of recommendations. We then show how to generate antidote data for each measure, pointing out a number of computational efficiencies, and discuss the impact on overall system accuracy. Our experiments show that a modest budget for antidote data can lead to significant improvements in the polarization or fairness of recommendations.




1. Introduction

Recommender systems are at the core of many online platforms that influence the choices we make in our daily lives ranging from what news we read (e.g., Facebook, Twitter) and whose products and services we buy (e.g., Amazon, Uber, Netflix) to whom we meet (e.g., OKCupid, Tinder). As users increasingly rely on recommender systems to make life-affecting choices, concerns are being raised about their inadvertent potential for social harm. Recently, studies have shown how recommender systems predicting user preferences might offer unfair or unequal quality of service to individual (or groups of) users (Beutel et al., 2017b; Burke et al., 2018) or lead to societal polarization by increasing the divergence between preferences of individual (or groups of) users (Dandekar et al., 2013).

Collaborative filtering recommender systems rely on user-provided data to learn models that are used to predict unknown user preferences. As a result, the recommendations made by such systems may carry undesired properties which are inherent in the observed data. A natural approach then is to consider transformations of input data that ameliorate those properties.

In this paper we explore a new approach. Rather than transforming the system's existing input data, we investigate whether simply augmenting the input with additional data can improve the social desirability of the resulting recommendations. We explore this question by developing a generic framework that can be used to improve a variety of socially relevant properties of recommender systems. Our framework turns a technique previously regarded as an anti-social attack on learning systems into a method with socially desirable outcomes.

As a strategy for improving recommendations, the data augmentation approach has multiple advantages. Adding new input data may be easier than modifying existing data inputs, for example, when a system is already running. Additional data can be provided to the system by a third party who needs neither the ability to modify the system's existing input, nor the ability to modify the system's algorithms. Further, the approach is applicable to a wide range of socially relevant properties of a system – essentially any property that can be expressed as a differentiable function of the system's inputs (ratings) and/or outputs (predictions).

The framework we develop starts from an existing matrix-factorization recommender system organized according to users and items, that has already been trained with some input (ratings) data. We consider the addition to the system of new users who provide ratings of existing items. The new users’ ratings are chosen according to our framework, so as to improve a socially relevant property of the recommendations that are provided to the original users. We call the additional ratings provided ‘antidote’ data (by analogy to existing work studying data poisoning).

In this paper we instantiate the framework by proposing metrics that capture the polarization and unfairness of the system's recommendations. These metrics build on and extend previous proposals, and include measures of both individual and group unfairness. We show how to generate antidote data for these metrics, and we present a number of computational efficiencies that can be exploited. In the process we consider the relationship between improvements to socially relevant measures and changes to overall system accuracy. Finally, we show that small amounts of antidote data (typically on the order of 1% new users) can generate a dramatic improvement (on the order of 50%) in the polarization or the fairness of the system's recommendations.

2. Related Work

In this section, we first discuss how our measures of fairness and polarization for recommender systems relate to those discussed in prior works. We then describe how we leverage insights and methods from adversarial machine learning, originally developed to cause social harm, towards social good in recommender systems.

Fairness in machine learning and recommender systems: The past years have witnessed a growing awareness about the potential for social harm by the use of machine learning algorithms in life-affecting decision making scenarios (Barocas and Selbst, 2016; Boyd and Crawford, 2012). In response, researchers have proposed numerous notions and measures of fairness for machine learning tasks as varied as classification (Zafar et al., 2017a, b; Hardt et al., 2016; Zemel et al., 2013), regression (Berk et al., 2017), ranking (Biega et al., 2018; Singh and Joachims, 2018; Zehlike et al., 2017), and set selection (Celis et al., 2016). These proposed notions fall under two broad categories: those measuring unfairness at the level of individual users and those that measure unfairness at the level of user groups (Dwork et al., 2012). The group-level unfairness measures can be further sub-divided into those that prohibit the use of information related to a user's sensitive group membership when making predictions and those that require users belonging to different sensitive groups to receive, on average, equal quality of service. The quality of service received by a user can in turn be measured either conditioned or unconditioned on the service outcomes deserved by the user.

Compared to learning tasks such as classification and regression, few studies have explored fairness notions in the context of recommender systems. Recently, Burke et al. (Burke et al., 2018) observed that recommender systems predicting user preferences over items would have to consider fairness from two sides: from the perspective of the users receiving the recommendations and from the perspective of the items being recommended. Some of the early works by Kamishima et al. (Kamishima et al., 2012, 2018; Kamishima and Akaho, 2017) focused on notions of group-level fairness, where the learning model is modified to ensure that item recommendations are independent of users' features revealing sensitive group membership such as race and gender. More recently, Beutel et al. (Beutel et al., 2017b) and Yao et al. (Yao and Huang, 2017) have defined notions of group-level fairness in recommender systems based on the accuracy of predictions across different groupings of users or items.

Novel Contributions: Here, we not only build upon the group-level notions of fairness proposed by Beutel et al. and Yao et al. (by generalizing them to scenarios with more than two groups), but we also extend them to the individual level. We further note that our fairness notions can be applied either from the perspective of users or items.

Mechanisms for fair machine learning and recommender systems: Prior works have explored a number of approaches to incorporating fairness in learning models and recommender systems. These approaches can be broadly categorized into those that rely on (i) pre-processing, i.e., transforming the training data to reduce the potential for unfair outcomes when using traditional learning models (Kamiran and Calders, 2012; Calmon et al., 2017), (ii) in-processing, i.e., changing the learning objectives and models to ensure fair outcomes even using unmodified training data (Kamishima et al., 2011; Agarwal et al., 2018), and (iii) post-processing, i.e., modifying potentially unfair outcomes from existing pre-trained learning models (Hardt et al., 2016; Corbett-Davies et al., 2017).

Novel Contributions: In this paper, we explore a different approach to incorporating our fairness notions in recommender systems. Our approach is in contrast to existing approaches to fair recommendations that primarily rely on in-processing (Burke et al., 2018; Kamishima et al., 2012). Unlike in-processing approaches, our approach does not require us to modify the recommendation algorithm for each of our desired notions of fairness.

Leveraging adversarial machine learning for social good: Our approach relies on methods that have been traditionally used in adversarial learning literature to cause social harm (Huang et al., 2011). Our key insight is that we can retarget adversarial methods designed to “poison” training data and cause social harm to generate “antidote” training data for social good. Specifically, our antidote data generation methods are inspired by prior work on data poisoning attacks on factorization-based collaborative filtering (Li et al., 2016).

Most pre-processing approaches target learning new fair (latent and transformed) representations of the original data. Recently, Beutel et al. (Beutel et al., 2017a) leveraged adversarial training procedures to remove information about sensitive group membership from the latent representations learned by a neural network. In contrast, our approach leaves the original training data untouched and instead adds new antidote data to achieve fairness objectives. As our evaluation results presented later in the paper will show, by leaving the original training data unmodified, our approach also achieves good overall prediction accuracy (the traditional objective of recommender algorithms).

Polarization: Polarization refers to the degree to which opinions, views, and sentiments diverge within a population. Several prior works have raised and explored concerns that recommender systems might increase societal polarization by tailoring recommendations to individual users' preferences and trapping users in their own "personalized filter bubbles" (Pariser, 2011; Hannak et al., 2013). Dandekar et al. (Dandekar et al., 2013) show how many traditional recommender algorithms used on Internet platforms can lead to polarization of user opinions in society.

Novel Contributions: We propose to measure the polarization of a recommender system as the extent to which predicted ratings for items vary (diverge) across users. Our polarization metric is consistent with those proposed in (Dandekar et al., 2013; Matakos et al., 2017). We show how our antidote data generation framework can be used to target reducing (or in certain scenarios, increasing) polarization in predicted ratings.

3. Optimal Antidote Data Problem

We start by presenting the system setup, notation, and problem definition. Assume $X \in \mathbb{R}^{n \times m}$ is a partially observed rating matrix of $n$ users and $m$ items such that element $X_{ij}$ denotes the rating given by user $i$ to item $j$. Let $\Omega$ be the set of indices of the known ratings in $X$. Also, $\Omega_i$ denotes the indices of the known item ratings for user $i$, and $\Omega^j$ denotes the indices of the known user ratings for item $j$.

For a matrix $A$, $P_\Omega(A)$ is a matrix whose elements at the indices in $\Omega$ are the corresponding elements of $A$ and zero elsewhere. Similarly, for a vector $a$, $P_{\Omega_i}(a)$ is a vector whose elements at the indices in $\Omega_i$ are the corresponding elements of $a$ and zero elsewhere. Throughout the paper, we denote the $i$-th column of a matrix $A$ by the vector $a_i$ and the $i$-th row by the vector $a^i$. All vectors are column vectors.

We assume a factorization-based collaborative filtering algorithm is applied to estimate the unknown ratings in $X$, i.e., for each user $i$ and item $j$ we find $d$-dimensional latent representations $u_i$ and $v_j$ such that $d \ll \min(n, m)$ and the rating $X_{ij}$ is modeled by $u_i^T v_j$.

More specifically, we consider a factorization algorithm that finds factors $U \in \mathbb{R}^{d \times n}$ and $V \in \mathbb{R}^{d \times m}$ by solving the following optimization problem:

$$\min_{U, V} \; \| P_\Omega(X - U^T V) \|_F^2 + \lambda \| U \|_F^2 + \lambda \| V \|_F^2 \quad (1)$$

where the columns of $U$ are the user latent vectors, and the columns of $V$ are the item latent vectors. The first term in (1) denotes the estimation error over the known elements of $X$ and the second term is an $\ell_2$-norm regularizer added to avoid overfitting. The unknown ratings are then estimated by setting $\hat{X} = U^T V$.
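As a concrete illustration, the regularized factorization objective (1) can be evaluated directly. The sketch below (in numpy, with hypothetical names; not the authors' implementation) computes the squared error over the known entries only, plus the regularization term:

```python
import numpy as np

def factorization_loss(X, mask, U, V, lam):
    """Objective (1): squared error over known ratings plus l2 regularization.

    X:    n x m rating matrix (values at unknown entries are ignored)
    mask: n x m boolean matrix, True where a rating is known (the set Omega)
    U:    d x n user factors; V: d x m item factors (ratings modeled as U^T V)
    """
    X_hat = U.T @ V                       # estimated ratings
    err = np.where(mask, X - X_hat, 0.0)  # P_Omega(X - U^T V)
    return np.sum(err ** 2) + lam * (np.sum(U ** 2) + np.sum(V ** 2))
```

Masking before summing is what restricts the error term to $P_\Omega$, i.e., only the observed ratings contribute to the fit.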

Figure 1. The effect of antidote data on a matrix factorization system. Initially the system learns factors $U$ and $V$ from a partially observed rating matrix $X$. The latent factors are then used to find the estimated rating matrix $\hat{X} = U^T V$, which is an input to the socially relevant metric $R$. Adding antidote ratings $\tilde{X}$ introduces the new user latent factor $\tilde{U}$ and modifies the item latent factor $V$, generating a new $\hat{X}$ that improves $R$.

We can think of our factorization algorithm as a function $ALG$ that maps a partially observed rating matrix $X$ to the matrices $U$ and $V$, and has additional parameters $d$ and $\lambda$, i.e., $(U, V) = ALG(X; d, \lambda)$. We assume that the factorization rank $d$ and the regularizer parameter $\lambda$ are set in a validation phase and remain fixed afterwards, and we use $ALG(X; d, \lambda)$ and $ALG(X)$ interchangeably throughout the paper.

We use $R$ to denote the socially relevant objective function that we seek to optimize by adding antidote data. $R$ is a function of the estimated ratings $\hat{X}$ and possibly (depending on the objective) other parameters such as the original ratings, user labels, etc. For example, consider an objective that minimizes the difference of average estimation errors between two groups of users. In that case, $R$ is a function defined over $\hat{X}$, $X$, and another parameter that indicates the group membership of each user. The specific objective functions we study in this paper are presented in Section 5. Now we can formally state the optimal antidote data problem:

Problem 1 (Optimal Antidote Problem).

Given a partially observed rating matrix $X \in \mathbb{R}^{n \times m}$, a budget $n'$, a factorization algorithm $ALG$, and an objective function $R$, find the antidote data $\tilde{X} \in \mathbb{R}^{n' \times m}$ such that $R$ is optimized when $ALG$ is applied jointly on $X$ and $\tilde{X}$.

Note that we may want to either maximize or minimize $R$ depending on the objective. Also, although in our notation $\tilde{X}$ corresponds to a set of artificial users, we can apply Problem 1 to generate a set of artificial items by using the symmetry of the problem, i.e., by transposing $X$.

Although some objective functions have additional parameters such as the original observed ratings ($X$) or a list of group memberships (which we denote $G$), adding antidote data only affects the output of the factorization algorithm and hence the rating estimations $\hat{X}$. Therefore, we denote the general objective function by $R(\hat{X})$ instead of $R(\hat{X}, X, G)$ for notational convenience. Assuming our goal is to minimize some objective function $R$, we can rewrite Problem 1 as:

$$\min_{\tilde{X} \in \mathcal{F}} \; R(\hat{X}) \quad (2)$$

where $\mathcal{F}$ is the set of feasible antidote data matrices.

Let $ALG_{\tilde{X}}(X)$ denote the factorization algorithm when applied jointly on the original and the antidote data. In this case, the output consists of the item latent vectors forming the columns of factor $V$, and the user latent vectors, which can be split into a matrix of original-user latent vectors $U$ and a matrix of antidote-user latent vectors $\tilde{U}$; therefore, we have $(U, \tilde{U}, V) = ALG_{\tilde{X}}(X)$.

Furthermore, $\hat{X}$ is a function of the original-user latent vectors $U$ and the item latent vectors $V$ (note that here $U$ and $V$ are the user and item latent vectors after adding the antidote data to the system, which can be different from the initial $U$ and $V$), i.e., $\hat{X} = U^T V$. This allows us to write (2) in the explicit form:

$$\min_{\tilde{X} \in \mathcal{F}} \; R(U^T V) \quad \text{where} \quad (U, \tilde{U}, V) = ALG_{\tilde{X}}(X) \quad (3)$$

In other words, we are looking for antidote data that modifies the outputs of $ALG$ such that $\hat{X}$ is modified to optimize $R$. Figure 1 shows a schematic representation of the antidote data effect on matrix factorization models. In the next section, we introduce an iterative method to solve (3).

4. Computing Antidote Data

In this section we introduce the framework for generating antidote data. We apply a projected gradient descent/ascent algorithm (GD/GA) to optimize the antidote data with respect to a socially relevant objective function. In section 4.1 we review a gradient descent method, introduced in (Li et al., 2016), for optimizing data poisoning attacks on matrix factorization models, which we adapt to optimize antidote data. Then, in section 4.2 we show how the characteristics of the antidote problem can be exploited for significant improvements in algorithmic efficiency.

4.1. A Projected Gradient Descent Approach

In this section we describe a projected gradient descent algorithm to solve the constrained optimization problem (2). A parallel approach is taken in (Li et al., 2016) for optimizing data poisoning attacks, which is itself an instance of the more general machine teaching problem introduced in (Mei and Zhu, 2015). We note that the framework introduced in (Mei and Zhu, 2015) can be used to extend the applicability of the antidote data approach beyond matrix factorization models.

The algorithm starts from an initial antidote data matrix $\tilde{X}$ whose size is determined by a given budget. At each iteration, the factorization algorithm is applied jointly on the original data and the current antidote data to find the updated factors $U$, $\tilde{U}$, $V$, and the estimated ratings $\hat{X}$. Then the gradient of the antidote utility with respect to the antidote data, $\partial R / \partial \tilde{X}$, is computed at the current point, and the algorithm chooses a step size and updates the antidote data. After each update, a projection function is applied to get a feasible solution. In this paper we only consider range constraints on the ratings, i.e., for each rating we assume $r_{min} \le \tilde{X}_{ij} \le r_{max}$, where $r_{min}$ and $r_{max}$ indicate the minimum and maximum feasible ratings in the system. Therefore the projection function simply truncates all the ratings in $\tilde{X}$ at $r_{min}$ and $r_{max}$.

Algorithm 1 presents the details of our antidote data optimization method. If the goal is to maximize $R$, we can apply a gradient ascent algorithm by simply changing the sign of the gradient step in line 6. The learning algorithm $ALG$ is an input to Algorithm 1. This is a realistic assumption in a white-box scenario, i.e., a party with full knowledge of the recommender system seeks to generate antidote data, which is an important case. However, we emphasize that there are settings in which other parties with only partial knowledge of the system can successfully adopt the antidote data approach as well. First of all, recent work (Wang and Gong, 2018) introduces a method for estimating the hyper-parameters of a learning algorithm. Using that method, we need not input $ALG$ to Algorithm 1, instead only providing the original factors $U$ and $V$. Moreover, in Section 6 we introduce heuristic algorithms that require less information about the recommender system than does Algorithm 1.

Input: Observed ratings $X$, budget $n'$, factorization algorithm $ALG$, utility $R$, feasible set $\mathcal{F}$
Output: Antidote data $\tilde{X}$
Initialization: initialize $\tilde{X} \in \mathbb{R}^{n' \times m}$
1 while not converged do
2        $(U, \tilde{U}, V) \leftarrow ALG_{\tilde{X}}(X)$
3        $\hat{X} \leftarrow U^T V$
4        Compute $\partial R / \partial \tilde{X}$
5        Find step size $\eta$
6        $\tilde{X} \leftarrow \Pi_{\mathcal{F}}\big(\tilde{X} - \eta \, \partial R / \partial \tilde{X}\big)$
Algorithm 1: Optimizing antidote data via projected gradient descent
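The overall loop of Algorithm 1 can be sketched in a few lines of numpy. This is an illustrative skeleton under stated assumptions, not the authors' code: `factorize` and `grad_R` are hypothetical stand-ins for $ALG$ and the gradient computation, and a fixed step size replaces the line search:

```python
import numpy as np

def optimize_antidote(X, n_new, factorize, grad_R, r_min, r_max,
                      step=0.1, n_iter=50):
    """Projected gradient descent over the antidote ratings (Algorithm 1 sketch).

    factorize(X_full) -> (U, U_tilde, V) for the stacked original + antidote ratings
    grad_R(X_tilde, U, U_tilde, V) -> dR/dX_tilde, same shape as X_tilde
    """
    m = X.shape[1]
    # fixed initialization: all antidote ratings set to the mid-range value
    X_tilde = np.full((n_new, m), (r_min + r_max) / 2.0)
    for _ in range(n_iter):
        U, U_tilde, V = factorize(np.vstack([X, X_tilde]))
        g = grad_R(X_tilde, U, U_tilde, V)
        # gradient step (flip the sign here to maximize R instead)
        X_tilde = X_tilde - step * g
        # projection: truncate all ratings to the feasible range
        X_tilde = np.clip(X_tilde, r_min, r_max)
    return X_tilde
```

The projection step is just clipping because the feasible set consists only of range constraints on the ratings.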

In order to compute $\partial R / \partial \tilde{X}$ in line 4 of Algorithm 1, we consider the explicit form of the objective function given in (3). Applying the chain rule we get:

$$\frac{\partial R}{\partial \tilde{X}} = \frac{\partial R}{\partial (U, V)} \, \frac{\partial (U, V)}{\partial \tilde{X}} \quad (4)$$
$\partial (U, V) / \partial \tilde{X}$ is the Jacobian matrix that contains the partial derivatives of the factors with respect to each element in $\tilde{X}$. These partial derivatives can be approximately computed by exploiting the KKT conditions of the factorization problem, as explained in (Li et al., 2016; Mei and Zhu, 2015). However, in section 4.2 we show cases where the full computation of such partial derivatives is not required, and we explain how to derive the necessary elements.

By applying the chain rule one more time on $\partial R / \partial (U, V)$ we get:

$$\frac{\partial R}{\partial (U, V)} = \frac{\partial R}{\partial \hat{X}} \, \frac{\partial \hat{X}}{\partial (U, V)} \quad (5)$$
The first term in (5) is the gradient of the antidote utility with respect to the estimated ratings. In this paper we only consider differentiable utilities, as described in more detail in section 5.

The second term in (5) is the gradient of the estimated ratings with respect to the factors $U$ and $V$. This term is straightforward to compute since the ratings are linear in each factor, i.e., $\hat{X}_{ij} = u_i^T v_j$.

In this paper we do not make assumptions (e.g., convexity) about the antidote utility other than being differentiable; the framework is a general method to improve a socially relevant metric rather than one that seeks the global optimum of the function $R$. However, we note that introducing antidote objectives with certain provable properties, which can provide convergence guarantees or more efficient ways to find the step size in Algorithm 1, is a potential direction for future research.

4.2. Efficient Computation of the Gradient Step

In this section we show how to further simplify (5) to make the update step of Algorithm 1 more efficient.

First, we write $\partial \hat{X} / \partial (U, V)$ in terms of the block matrices that contain the partial derivatives of the estimated ratings in $\hat{X}$ with respect to each factor, i.e., $[\partial \hat{X} / \partial U, \; \partial \hat{X} / \partial \tilde{U}, \; \partial \hat{X} / \partial V]$. (For matrices $A$ and $B$, we use $\partial A / \partial B$ to denote a matrix that contains the partial derivatives $\partial A_{ij} / \partial B_{kl}$ for each $(i, j)$ and $(k, l)$.) Notice that $\hat{X} = U^T V$ does not depend on $\tilde{U}$ and therefore $\partial \hat{X} / \partial \tilde{U} = 0$.

Furthermore, we write $\partial (U, V) / \partial \tilde{X}$ in terms of the block matrices that contain the partial derivatives of each factor with respect to each element in $\tilde{X}$. Assuming that an infinitesimal change in $\tilde{X}_{ij}$ only results in first order updates in the vectors $\tilde{u}_i$ and $v_j$, we get $\partial u_{i'} / \partial \tilde{X}_{ij} = 0$ for all original users $i'$.

Exploiting the fact that $\partial \hat{X} / \partial \tilde{U} = 0$, we can simplify (5) to:

$$\frac{\partial R}{\partial \tilde{X}_{ij}} = \frac{\partial R}{\partial \hat{X}} \, \frac{\partial \hat{X}}{\partial v_j} \, \frac{\partial v_j}{\partial \tilde{X}_{ij}} \quad (6)$$
Now we derive $\partial R / \partial \tilde{X}_{ij}$ for each element $\tilde{X}_{ij}$ of the antidote data. Let $v_1, \ldots, v_m$ be the item vectors forming the columns of $V$. Then, starting from the last term in (6) and assuming first order updates, we know that $\partial v_{j'} / \partial \tilde{X}_{ij}$ is non-zero only if $j' = j$, and it can be approximately computed as (details are provided in Appendix A.1):

$$\frac{\partial v_j}{\partial \tilde{X}_{ij}} \approx \Big( \lambda I + \sum_{i' \in \Omega^j} u_{i'} u_{i'}^T \Big)^{-1} \tilde{u}_i \quad (7)$$

where the sum runs over all users (original and antidote) with known ratings for item $j$.
On the other hand, $\partial \hat{X}_{i'j'} / \partial v_j = u_{i'}$ if $j' = j$, and it is a $d$-dimensional zero vector otherwise. Therefore, we need to compute $\partial R / \partial \hat{X}_{i'j'}$ only for $j' = j$, and we have:

$$\frac{\partial R}{\partial \tilde{X}_{ij}} = \sum_{i'=1}^{n} \frac{\partial R}{\partial \hat{X}_{i'j}} \, u_{i'}^T \, \frac{\partial v_j}{\partial \tilde{X}_{ij}} \quad (8)$$
Let $\Gamma$ be a matrix formed by reshaping $\partial R / \partial \hat{X}$ into an $n \times m$ matrix such that $\Gamma_{ij} = \partial R / \partial \hat{X}_{ij}$. Then we can write (8) as:

$$\frac{\partial R}{\partial \tilde{X}_{ij}} = \gamma_j^T \, U^T \, \frac{\partial v_j}{\partial \tilde{X}_{ij}} \quad (9)$$

where $\gamma_j$ is the $j$-th column of $\Gamma$.

By using (9) instead of the general formula in (5) we can significantly reduce the number of computations required for finding the gradient of the utility function with respect to the antidote data. Furthermore, the term $\gamma_j^T U^T$ appears in all the partial derivatives that correspond to elements in column $j$ of $\tilde{X}$, and it can be precomputed in each iteration of the algorithm and reused for computing the partial derivatives with respect to different antidote users.
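The per-column structure of (9) can be sketched in numpy as follows. This is an illustrative sketch, not the authors' implementation; `Gamma` holds the entries of $\partial R / \partial \hat{X}$, and `item_raters` is a hypothetical index structure listing, for each item, the users (original and antidote) with known ratings for it:

```python
import numpy as np

def antidote_grad(Gamma, U, U_tilde, U_all, item_raters, lam):
    """Approximate dR/dX_tilde using the per-item simplification of (9).

    Gamma:       n x m matrix with Gamma[i, j] = dR/dX_hat[i, j]
    U:           d x n original-user factors; U_tilde: d x n' antidote-user factors
    U_all:       d x (n + n') stacked factors of all users
    item_raters: for each item j, column indices into U_all of users with a
                 known rating for item j
    """
    d, n_new = U_tilde.shape
    m = Gamma.shape[1]
    grad = np.zeros((n_new, m))
    for j in range(m):
        c_j = U @ Gamma[:, j]                 # precomputed term, shared by column j
        Uj = U_all[:, item_raters[j]]
        M_j = lam * np.eye(d) + Uj @ Uj.T     # matrix whose inverse appears in the
                                              # first-order update of v_j
        w_j = np.linalg.solve(M_j, U_tilde)   # one d x n' solve per item, all
                                              # antidote users at once
        grad[:, j] = c_j @ w_j                # eq (9) for every antidote user
    return grad
```

Solving one $d \times d$ system per item, rather than one per antidote rating, is the computational saving the text describes.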

5. Social Objective Functions

The previous section developed a general framework for improving various properties of recommender systems; in this section we show how to apply that framework specifically to issues of polarization and fairness.

As described in Section 2, polarization is the degree to which opinions, views, and sentiments diverge within a population. Recommender systems can capture this effect through the ratings that they present for items. To formalize this notion, we define polarization in terms of the variability of predicted ratings when compared across users. In fact, we note that both very high variability, and very low variability of ratings may be undesirable. In the case of high variability, users have strongly divergent opinions, leading to conflict. Recent analyses of the YouTube recommendation system have suggested that it can enhance this effect (Nicas, 2018; O’Callaghan et al., 2015). On the other hand, the convergence of user preferences, i.e., very low variability of ratings given to each item across users, corresponds to increased homogeneity, an undesirable phenomenon that may occur as users interact with a recommender system (Chaney et al., 2017). As a result, in what follows we consider using antidote data in both ways: to either increase or decrease polarization.

As also described in Section 2, unfairness is a topic of growing interest in machine learning. Following the discussion in that section, we consider a recommender system fair if it provides equal quality of service (i.e., prediction accuracy) to all users or all groups of users (Zafar et al., 2017b).

Next we formally define the metrics that specify the objective functions associated with each of the above objectives. Since the gradient of each objective function is used in the optimization algorithm, for reproducibility we provide the details about derivation of the gradients in appendix A.2.

5.1. Polarization

To capture polarization, we seek to measure the extent to which the user ratings disagree. Thus, to measure user polarization we consider the estimated ratings $\hat{X}$, and we define the polarization metric as the normalized sum of pairwise squared euclidean distances between estimated user ratings, i.e., between the rows $\hat{x}^i$ of $\hat{X}$. In particular:

$$R_{pol}(\hat{X}) = \frac{1}{n^2 m} \sum_{1 \le i < i' \le n} \| \hat{x}^i - \hat{x}^{i'} \|_2^2 \quad (10)$$
The normalization term in (10) makes the polarization metric identical to the following definition (we can derive it by rewriting (10) as a sum over the per-item variances):

$$R_{pol}(\hat{X}) = \frac{1}{m} \sum_{j=1}^{m} \sigma_j^2 \quad (11)$$

where $\sigma_j^2$ is the variance of the estimated user ratings for item $j$. Thus this polarization metric can be interpreted either as the average of the variances of estimated ratings for each item, or equivalently as the average user disagreement over all items.
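The equivalence between the pairwise form (10) and the per-item-variance form (11) is easy to check numerically. A small numpy sketch with hypothetical function names, assuming squared distances normalized by $n^2 m$:

```python
import numpy as np

def polarization_pairwise(X_hat):
    """Normalized sum of pairwise squared distances between user rows (10)."""
    n, m = X_hat.shape
    total = 0.0
    for i in range(n):
        for k in range(i + 1, n):
            total += np.sum((X_hat[i] - X_hat[k]) ** 2)
    return total / (n * n * m)

def polarization_variance(X_hat):
    """Average over items of the variance of estimated ratings (11)."""
    return np.mean(np.var(X_hat, axis=0))
```

The second form costs O(nm) rather than O(n^2 m), so it is the one to use in practice.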

5.2. Fairness

Individual fairness. For each user $i$, we define $\ell_i$, the loss of user $i$, as the mean squared estimation error over the known ratings of user $i$:

$$\ell_i = \frac{1}{|\Omega_i|} \sum_{j \in \Omega_i} (\hat{X}_{ij} - X_{ij})^2 \quad (12)$$
Then we define the individual unfairness $R_{indv}$ as the variance of the user losses (note that for a set of $n$ equally likely values the variance can be expressed without referring to the mean, as in the second equality below):

$$R_{indv}(\hat{X}) = \mathrm{var}(\ell_1, \ldots, \ell_n) = \frac{1}{2n^2} \sum_{i=1}^{n} \sum_{i'=1}^{n} (\ell_i - \ell_{i'})^2 \quad (13)$$

To improve individual fairness, we seek to minimize $R_{indv}$.
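The individual unfairness metric, the variance across users of the per-user mean squared error over known ratings, translates directly into numpy (a sketch with hypothetical names):

```python
import numpy as np

def individual_unfairness(X, X_hat, mask):
    """Variance across users of each user's mean squared error on known ratings.

    Users with no known ratings are skipped, an edge case the definition
    leaves open.
    """
    sq_err = (X - X_hat) ** 2
    losses = []
    for i in range(X.shape[0]):
        known = mask[i]
        if known.any():
            losses.append(sq_err[i, known].mean())  # per-user loss l_i
    return float(np.var(losses))                    # population variance
```

Note that `np.var` computes the population variance (divides by $n$), matching the equally-likely-values form of the metric.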

Group fairness. Let $I$ be the set of all users/items and $\{I_1, \ldots, I_g\}$ be a partition of the users/items into $g$ groups, i.e., $I = \bigcup_{s=1}^{g} I_s$. We define $L_s$, the loss of group $s$, as the mean squared estimation error over all known ratings in group $s$:

$$L_s = \frac{1}{|\Omega_s|} \sum_{(i, j) \in \Omega_s} (\hat{X}_{ij} - X_{ij})^2 \quad (14)$$

where $\Omega_s$ denotes the indices of the known ratings belonging to group $s$.
For a given partition, we define the group unfairness $R_{grp}$ as the variance of all group losses:

$$R_{grp}(\hat{X}) = \mathrm{var}(L_1, \ldots, L_g) = \frac{1}{2g^2} \sum_{s=1}^{g} \sum_{s'=1}^{g} (L_s - L_{s'})^2 \quad (15)$$

Again, to improve group fairness, we seek to minimize $R_{grp}$.
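The group unfairness metric admits an equally direct sketch. Here `groups` is a hypothetical label array assigning each user to a group (a user-side partition; an item-side partition works symmetrically on columns):

```python
import numpy as np

def group_unfairness(X, X_hat, mask, groups):
    """Variance across groups of the per-group mean squared error on known ratings.

    groups: length-n array of group labels, one per user row of X.
    """
    sq_err = np.where(mask, (X - X_hat) ** 2, 0.0)
    losses = []
    for g in np.unique(groups):
        rows = (groups == g)
        n_known = mask[rows].sum()
        if n_known > 0:
            losses.append(sq_err[rows].sum() / n_known)  # per-group loss L_s
    return float(np.var(losses))                         # population variance
```

With one group per user this reduces to the individual unfairness metric, which is one way to see the two definitions as ends of a spectrum.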

5.3. Accuracy vs. Social Welfare

Adding antidote data to the system to improve a social utility will also have an effect on the overall prediction accuracy. Previous works have considered social objectives as regularizers or constraints added to the recommender model (e.g., (Burke et al., 2018; Zafar et al., 2017c; Kamishima et al., 2011)), implying a trade-off between the prediction accuracy and a social objective.

However, in the case of the metrics we define here, the relationship is not as simple. Considering polarization, we find that in general, increasing or decreasing polarization will tend to decrease system accuracy. In either case we find that system accuracy only declines slightly in our experiments; we report on the specific values in Section 6. Considering either individual or group unfairness, the situation is more subtle. Note that our unfairness metrics will be exactly zero for a system with zero error (perfect accuracy). As a result, it is possible that as the system decreases unfairness, overall accuracy may either increase or decrease. We illustrate these effects in our experiments in Section 6.

6. Effectiveness

(a) Minimizing polarization
(b) Maximizing polarization
Figure 2. Modifying user polarization.

In this section we use the tools developed in previous sections to study the effectiveness of antidote data in varying the polarization and reducing the unfairness of a matrix-factorization based recommender system.

We consider a recommender system that estimates unknown ratings by solving the regularized matrix factorization problem as defined by (1). We implemented an alternating least squares algorithm (Hastie et al., 2015; Hardt, 2014) to find the factors. We use the MovieLens 1M dataset which contains around 1 million ratings of 4000 movies made by 6000 users, with ratings on a 5-point scale (Harper and Konstan, 2016). We choose the 1000 most frequently rated movies, and use different subsets of users in different experiments as described below.
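One ALS sweep for objective (1) solves a ridge regression per user and per item. The following compact numpy sketch illustrates the idea (an illustrative sketch under the notation used here, not the authors' implementation):

```python
import numpy as np

def als_sweep(X, mask, U, V, lam):
    """One alternating least squares pass for the regularized objective (1).

    Holding V fixed, each user vector u_i has the closed-form ridge solution
    (lam*I + sum_j v_j v_j^T)^{-1} sum_j v_j X_ij over that user's known items;
    item vectors are then updated symmetrically.
    """
    d = U.shape[0]
    for i in range(X.shape[0]):            # update user factors
        js = np.where(mask[i])[0]
        if js.size:
            Vj = V[:, js]
            U[:, i] = np.linalg.solve(lam * np.eye(d) + Vj @ Vj.T, Vj @ X[i, js])
    for j in range(X.shape[1]):            # update item factors
        is_ = np.where(mask[:, j])[0]
        if is_.size:
            Ui = U[:, is_]
            V[:, j] = np.linalg.solve(lam * np.eye(d) + Ui @ Ui.T, Ui @ X[is_, j])
    return U, V
```

Because each subproblem is solved exactly, every half-sweep is guaranteed not to increase the objective, which is what makes ALS a reliable workhorse for this factorization.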

For each dataset we perform a validation process to choose the hyper-parameters so as to obtain realistic settings. The hyper-parameters are selected based on the average root-mean-square error (RMSE) of the factorization in multiple random splits of observed ratings into training and validation sets. We assume that the hyper-parameters are fixed during the antidote data generation process since the antidote data is generated for a fixed recommender system.

First we show the effectiveness of antidote data in modifying the user polarization as defined in section 5.1. In section 6.2 we describe different heuristics that can significantly speed up the construction of antidote data. Finally, section 6.3 demonstrates the effectiveness of applying antidote data for improving fairness.

6.1. Polarization

To explore modifying user polarization, we choose a random subset of 1000 users yielding a matrix in which 11% of the elements are known. As previously mentioned, it may be of interest to either increase or decrease the polarization metric in different scenarios. We present an example for each case. We do so by taking advantage of the fact that different hyperparameter combinations can yield models that are very close in overall accuracy but that differ significantly with respect to initial user polarization in the system.

In particular, we observe that the two hyper-parameter settings yield nearly identical average validation RMSE over ten random splits of the observed ratings into training and validation sets, while the polarization ($R_{pol}$) of the estimated rating matrix differs substantially between the two settings. We use the less polarized setting as an example where the goal is to increase the polarization metric (to avoid homogeneity), and the more polarized setting as an example of a polarized system where the goal is to reduce polarization.

For each of the maximization and minimization objectives, we compare the performance of the antidote data generation framework with a baseline algorithm. When seeking to minimize polarization, we use baseline_min. This algorithm tries to reduce the variance of the estimated ratings for each item by setting the ratings given to the corresponding item in the antidote data to the average of the known ratings for that item in the original data. When seeking to maximize polarization, we use baseline_max. This algorithm generates antidote data by setting half of the user ratings for each item to the maximum feasible rating value and the other half to the minimum feasible rating value.
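Both baselines are simple to state in code. A sketch under the naming used here (the exact construction in the experiments may differ in details such as tie-breaking):

```python
import numpy as np

def baseline_min(X, mask, n_new):
    """Antidote rows that rate every item at the mean of its known ratings."""
    col_sums = np.where(mask, X, 0.0).sum(axis=0)
    col_counts = np.maximum(mask.sum(axis=0), 1)   # avoid division by zero
    item_means = col_sums / col_counts
    return np.tile(item_means, (n_new, 1))

def baseline_max(n_new, m, r_min, r_max):
    """Antidote rows split between the extreme feasible ratings for every item."""
    X_tilde = np.full((n_new, m), r_min)
    X_tilde[: n_new // 2] = r_max                  # half max, half min per item
    return X_tilde
```

Neither baseline uses the factorization model at all, which is why the gradient-based antidote data can outperform them by a wide margin.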

Furthermore, we consider two different initializations for the optimization process: in the case of GD(fixed init), all the ratings in the initial antidote data are set to the same value. In the case of GD(random init), we run the optimization multiple times starting from random initializations and return the best solution.

(a) Individual fairness
(b) Group fairness
Figure 3. Improving fairness.

Figure 2 compares the effects of adding antidote data constructed by different methods on polarization. After each injection of the antidote data, the new polarization is computed using the original data only, i.e., we ignore the injected data in evaluating polarization. We present our results for different budgets varying from a single antidote user to 5% of the number of original users. We also show the effect of ratings that are randomly generated over the feasible range, when used as antidote data.

[Figure 4. Minimizing polarization with a 1% budget: (a) effect on per-item polarization; (b) effect on rating estimations for Patch Adams (1998); (c) effect on top-k recommended items.]

Our results show that the antidote data generation framework can successfully either minimize or maximize polarization. Antidote data generated by our method are considerably more effective than the baseline algorithms as well as random data. We observe that a 2% budget is enough to reduce the initial polarization in a polarized setting by 50% and increase the polarization in a less polarized setting by 10%. Furthermore, we observe that random initialization is more effective for minimizing polarization whereas initializing all the antidote ratings from the same value is more effective for maximizing polarization.

To better understand the effect of antidote data on user polarization, in Figure 4 we demonstrate the effect of antidote data with a 1% budget for the minimization case. Note that the effect of adding antidote data on estimation error is negligible: the RMSE of rating estimations for known elements is essentially unchanged. In other words, antidote data modifies the prediction model such that its predictions still approximately agree with the known ratings, but the polarization of the new estimated rating matrix is significantly different.

Figure 3(a) shows the distribution of per-item polarization (as defined in (11)) along with its mean, before and after antidote data injection. The figure shows that without antidote data, a small set of items makes large contributions to overall polarization: they have quite high variance in ratings, shown by the long distributional tail. The addition of antidote data dramatically reduces this effect, and also significantly lowers the mean per-item polarization from 0.55 to 0.29.
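Assuming per-item polarization is the variance of each item's estimated ratings and the overall metric is their mean (our reading of the plotted quantities; the exact normalization in (11) may differ), the computation is:

```python
import numpy as np

def per_item_polarization(R_hat):
    # variance of estimated ratings within each item (column)
    return R_hat.var(axis=0)

def polarization(R_hat):
    # overall polarization: mean of the per-item variances
    return per_item_polarization(R_hat).mean()
```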

Figure 3(b) shows the effect of adding antidote data on the estimated ratings of Patch Adams (1998), one of the movies for which the variance of estimated ratings is large before adding antidote data. We observe that the distribution of known ratings for this movie indicates a polarized case with two peaks at 2 and 4. The initial rating estimations in this case lie in an interval that is much larger than the range of observed ratings. Adding antidote data modifies the extreme rating estimations, resulting in a unimodal distribution over the range of original ratings.

While the goal of adding antidote data is to modify the system's predicted ratings, an important use case for such a system is to output the top-rated items as the system's recommendations. Hence, it is important to ask how modifying predicted ratings changes the ranking of unrated items, i.e., the output of a top-k recommender system. Therefore, we consider the top-k recommended items on a per-user basis and measure the degree of change in the recommendations before and after adding antidote data. We use the Jaccard similarity of the sets of recommended items to measure this change.

Figure 3(c) shows the average of Jaccard similarities across all users. Our results show that the antidote data significantly changes the output of a top-k recommender system. For example, adding 1% antidote data changes the top recommended item for 84% of all users. We observe that, in general, as the number of considered top items grows, the effect lessens (Jaccard similarity grows). However, the changes in the set of recommended items remain significant even for larger values of k.
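The top-k change measurement can be sketched as follows (helper names are ours; we assume already-rated items are excluded from the recommendation lists):

```python
import numpy as np

def topk_items(R_hat, rated_mask, k):
    """Per-user top-k unrated items by estimated rating."""
    scores = np.where(rated_mask, -np.inf, R_hat)  # exclude already-rated items
    return np.argsort(-scores, axis=1)[:, :k]

def mean_jaccard(top_before, top_after):
    """Average Jaccard similarity of per-user recommendation sets."""
    sims = []
    for b, a in zip(top_before, top_after):
        b, a = set(b.tolist()), set(a.tolist())
        sims.append(len(b & a) / len(b | a))
    return float(np.mean(sims))
```

Comparing `topk_items` computed before and after antidote injection, with `mean_jaccard` averaged over users, reproduces the kind of curve shown in the figure.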

6.2. Heuristic Algorithms

[Figure 5. Optimal antidote data: (a) minimizing polarization; (b) individual fairness.]

In this section we introduce heuristic algorithms that dramatically reduce the computational cost of antidote data generation. Notice that the computational cost of Algorithm 1 is dominated by running the matrix factorization algorithm in each pass through the gradient descent loop. The heuristics are based on approximations that can be made at different steps of Algorithm 1 to minimize the number of times the factorization is performed. The approximations are motivated by certain patterns observed in the antidote data generated by Algorithm 1.

Figure 5 shows the antidote data generated by GD(random init) for minimizing polarization (Fig. 4(a)) and minimizing individual unfairness (Fig. 4(b)). We observe that: (i) most of the ratings in the resulting antidote data are equal to one of the boundary values of the feasible set (0 or 5 in our experiments), and (ii) in the fairness case, most of the users (rows) in the antidote data converge to a nearly-identical pattern of ratings over items, even if they are initialized with different random values.

Based on the above observations, in our evaluations we consider two heuristics for generating antidote data for fairness. The first (heuristic1) offers considerable computational savings, and the second (heuristic2) offers even more savings, while additionally removing the need for access to the factorization algorithm or its hyper-parameters. (Pseudocode is provided in Appendix B.)

heuristic1 reduces the number of factorization evaluations to a single call by combining observations (i) and (ii). It works by considering the addition of only a single row of antidote data, and computes gradients for that row. Rather than performing gradient descent over a series of small steps, it then simply sets each value in the antidote data row to one of the two boundary rating values, depending on the sign of the corresponding gradient component. It then replicates the resulting row as many times as dictated by the antidote data budget.

In the case of heuristic2, in addition to using the above observations, we approximate the direction of the gradient without performing matrix factorization at all, given access only to the original factors X and Y. In this case, access to the factorization algorithm or its hyper-parameters is not required. For a sufficient level of regularization, the gradient in (9) can be approximated by a quantity that is identical across the rows of the antidote data and depends only on the original item factors. This leads to a modification of heuristic1 in which all the values in a given column of the antidote data are set to the same boundary rating value, depending on the sign of the corresponding approximate gradient component.

6.3. Fairness

Algorithm        | 1 user | 0.5%   | 1%     | 2%      | 5%
GD(random init)  | 0.1086 | 0.1084 | 0.1054 | 0.1157  | 0.1086
GD(fixed init)   | 0.1086 | 0.1083 | 0.0929 | 0.0985  | 0.0968
heuristic1       | 0.1086 | 0.1084 | 0.0816 | 0.0800* | 0.0830
heuristic2       | 0.1086 | 0.1084 | 0.0817 | 0.0818  | 0.0811

Table 1. Effect of antidote data on individual unfairness in the held-out ratings, for budgets ranging from a single antidote user to 5% of the original users. Individual unfairness before antidote data is 0.1087; * marks the minimum.

In this section we show how antidote data as generated by our various algorithms improves fairness. We again use the MovieLens dataset; to study group fairness, we group movies by genre as specified in the dataset. In contrast to the polarization case, the fairness objectives are functions of both the known and predicted user ratings. Hence we choose the 1000 most active users and the 1000 most frequently rated movies, which gives us a rating matrix in which 36% of the elements are known. For this dataset we run the matrix factorization algorithm with fixed hyper-parameters.
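Selecting the most active users and most frequently rated movies can be sketched as follows (our own helper, assuming unknown ratings are encoded as NaN):

```python
import numpy as np

def densest_submatrix(R, n_users=1000, n_items=1000):
    """Keep the rows (users) and columns (items) with the most known ratings."""
    known = ~np.isnan(R)
    top_users = np.argsort(-known.sum(axis=1))[:n_users]
    top_items = np.argsort(-known.sum(axis=0))[:n_items]
    return R[np.ix_(top_users, top_items)]
```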

To verify that adding antidote data improves the fairness of unseen ratings, we hold out 20% of the known ratings per user as a test set. We use the remaining data (training set) to generate antidote data; we then measure the effectiveness of the resulting antidote data in both training and test sets.
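A per-user holdout of this kind can be sketched as follows (a minimal NumPy sketch; the helper name, seed, and the one-rating minimum are our own illustrative choices):

```python
import numpy as np

def per_user_holdout(R, frac=0.2, seed=0):
    """Move `frac` of each user's known ratings into a test matrix;
    unknown ratings are encoded as NaN."""
    rng = np.random.default_rng(seed)
    train, test = R.copy(), np.full_like(R, np.nan)
    for u in range(R.shape[0]):
        known = np.flatnonzero(~np.isnan(R[u]))
        held = rng.choice(known, size=max(1, int(frac * len(known))),
                          replace=False)
        test[u, held] = R[u, held]
        train[u, held] = np.nan
    return train, test
```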

We start by assessing the effect of antidote data on fairness in the training data. We show the impact of antidote data on individual unfairness in Figure 2(a) and on group unfairness in Figure 2(b). The figures compare the effect of antidote data as generated by four different algorithms: Algorithm 1 with the two initializations described in Section 6.1, and the two heuristic algorithms introduced in Section 6.2.

The results show that all algorithms improve fairness considerably. In fact most of the benefits of antidote data can be obtained by only adding 1% additional users in the individual fairness case and 0.5% additional users in the group fairness case. The figures also show that the much simpler heuristics, in which all rows of the antidote data are identical, are effective: for individual fairness, they provide almost all the benefits of Algorithm 1 while for group fairness they provide around half of the benefits of Algorithm 1.

Algorithm        | 1 user | 0.5%    | 1%     | 2%     | 5%
GD(random init)  | 0.0087 | 0.0035* | 0.0041 | 0.0042 | 0.0045
GD(fixed init)   | 0.0087 | 0.0042  | 0.0040 | 0.0040 | 0.0044
heuristic1       | 0.0087 | 0.0055  | 0.0056 | 0.0057 | 0.0058
heuristic2       | 0.0087 | 0.0055  | 0.0056 | 0.0057 | 0.0058

Table 2. Effect of antidote data on group unfairness in the held-out ratings, for budgets ranging from a single antidote user to 5% of the original users. Group unfairness before antidote data is 0.0088; * marks the minimum.

Tables 1 and 2 show the resulting values of the individual and group unfairness metrics on the test set after antidote data addition, for different budgets and different algorithms. We observe that antidote data generated to reduce unfairness in the training data is also effective at reducing unfairness on the held-out test data. The optimal value (minimum unfairness) in each table is highlighted. We observe that even on the test set, a 2% budget using heuristic1 can reduce individual unfairness by over 25% (from 0.1087 to 0.0800), and group unfairness can be lowered by more than 50% (from 0.0088 to 0.0035) using GD(random init) with a 0.5% budget.

[Figure 6. Antidote data effect on fairness: (a) individual fairness; (b) group fairness.]

Figure 6 provides more insight into how adding antidote data reduces individual and group unfairness. In each case, we consider the setting that reaches the minimum unfairness on the test set, as presented in Tables 1 and 2.

Figure 5(a) shows the effect of optimal antidote data on per-user RMSEs. The figure demonstrates a number of points. First, adding antidote data results in a model with less variation in per-user RMSE of rating estimations, in both training and test sets. Second, a noticeable way in which adding antidote data improves fairness is by reducing the magnitude of the outliers that drive unfairness, in both training and testing. Finally, the figure shows that in this example adding antidote data actually improves the overall accuracy of the model's predictions.
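The per-user RMSE that the figure visualizes can be computed as follows (our own sketch, with unknown ratings encoded as NaN); its spread across users is what the unfairness metric penalizes:

```python
import numpy as np

def per_user_rmse(R, R_hat):
    """RMSE over known ratings, computed separately for each user."""
    err2 = (R - R_hat) ** 2              # NaN wherever R is unknown
    return np.sqrt(np.nanmean(err2, axis=1))
```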

Figure 5(b) shows the effect of optimal antidote data on per-group RMSE in the test set. For each group (genre) of movies, the corresponding point shows the group’s RMSE before and after adding antidote data. Additionally, the boxplots on each axis illustrate the distribution of RMSE values across groups before and after adding antidote data.

First, we observe that all points are below the identity line, i.e., adding antidote data improves the prediction accuracy for every genre and thus the overall accuracy of the model. Moreover, the boxplots show that the improvements in rating estimations are such that the cross-group variability in RMSE decreases, reaching a fairer situation. Finally, we see that outliers particularly benefit from the addition of antidote data; this can be seen in the larger RMSE improvements for genres that initially had larger RMSE. In particular, Documentary and Horror have the largest prediction errors before adding antidote data, and their RMSEs are the most improved (furthest below the identity line) after adding antidote data.

7. Conclusion

In this paper we propose a new strategy for improving the socially relevant properties of a recommender system: adding antidote data. We have presented an algorithmic framework for this strategy and applied it to a range of socially important objectives. Using this strategy, one does not need to modify the original system input data or the system’s algorithm. We show that the resulting framework can efficiently improve the polarization or fairness properties of a recommender system. We conclude that the developed framework can be a flexible and effective approach to addressing the social impacts of a recommender system.

Acknowledgements. This research was supported by NSF grants CNS-1618207, IIS-1421759, and a European Research Council (ERC) Advanced Grant for the project “Foundations for Fair Social Computing” (grant no. 789373).

Appendix A Derivation of the gradients

a.1. Derivation of $\partial \tilde{X} / \partial \tilde{R}$

Let $\mathcal{L}$ be the value of the objective function in (1). Assuming that the factorization algorithm finds a local optimum of $\mathcal{L}$, we have $\partial \mathcal{L} / \partial \tilde{X} = 0$, which, since all entries of the antidote ratings $\tilde{R}$ are known, gives us the following:

$$-2\big(\tilde{R} - \tilde{X} Y^\top\big) Y + 2 \lambda \tilde{X} = 0$$

From the above equation we can show that the following formula for $\tilde{X}$ holds at a local optimum of $\mathcal{L}$:

$$\tilde{X} = \tilde{R}\, Y \big(Y^\top Y + \lambda I\big)^{-1}$$

Therefore, assuming that an infinitesimal change in $\tilde{R}$ only results in first-order corrections (i.e., the item factors $Y$ can be treated as fixed), we can write:

$$\frac{\partial \tilde{X}}{\partial \tilde{R}} = Y \big(Y^\top Y + \lambda I\big)^{-1}$$
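Treating an antidote user's row as fully observed and the item factors $Y$ as fixed, the objective reduces to ridge regression, whose minimizer is $\tilde{X} = \tilde{R} Y (Y^\top Y + \lambda I)^{-1}$; a quick numerical check of this first-order condition (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d, lam = 4, 6, 2, 0.1
R = rng.normal(size=(n, m))      # fully observed (antidote) ratings
Y = rng.normal(size=(m, d))      # fixed item factors

# closed-form minimizer of ||R - X Y^T||_F^2 + lam * ||X||_F^2
X = R @ Y @ np.linalg.inv(Y.T @ Y + lam * np.eye(d))

# the gradient -2 (R - X Y^T) Y + 2 lam X should vanish at the optimum
grad = -2 * (R - X @ Y.T) @ Y + 2 * lam * X
assert np.allclose(grad, 0)
```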


a.2. Gradients of the objective functions

Polarization
For the polarization objective we have:

$$\frac{\partial P(\hat{R})}{\partial \hat{R}_{ij}} \;\propto\; \hat{R}_{ij} - \bar{r}_j$$

where $\bar{r}_j$ is the average estimated rating for item $j$.

Individual fairness
For the individual unfairness objective $U_I$ we have:

$$\frac{\partial U_I}{\partial \hat{R}_{ij}} \;\propto\; \big(\ell_i - \bar{\ell}\big) \frac{\partial \ell_i}{\partial \hat{R}_{ij}}$$

where $\ell_i$ is the loss of user $i$ and $\bar{\ell}$ is the average of user losses.

Group fairness
Assume $g(\cdot)$ is a function that maps each user/item to its group label. For the group unfairness objective $U_G$ we have:

$$\frac{\partial U_G}{\partial \hat{R}_{ij}} \;\propto\; \big(L_{g(i)} - \bar{L}\big) \frac{\partial L_{g(i)}}{\partial \hat{R}_{ij}}$$

where $L_g$ is the loss of group $g$ and $\bar{L}$ is the average of group losses.
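All of these gradients differentiate variance-style quantities; a quick check that the mean's dependence on each entry cancels, comparing the analytic gradient of a variance with a finite-difference estimate:

```python
import numpy as np

def variance(x):
    return np.mean((x - x.mean()) ** 2)

def variance_grad(x):
    # analytic gradient: the terms from the mean's dependence on x cancel,
    # leaving (2/n) * (x_i - mean(x))
    return 2.0 / len(x) * (x - x.mean())

x = np.array([1.0, 4.0, 2.0, 5.0])
eps = 1e-6
num = np.array([
    (variance(x + eps * np.eye(len(x))[i])
     - variance(x - eps * np.eye(len(x))[i])) / (2 * eps)
    for i in range(len(x))
])
assert np.allclose(num, variance_grad(x), atol=1e-6)
```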

Appendix B Heuristic Algorithms

In this section we present the pseudocode of the heuristic algorithms introduced in section 6.2 for generating antidote data to improve individual and group fairness.

b.1. heuristic1

  1. Start from a single antidote user (one row of ratings).

  2. Run the matrix factorization algorithm once on the data augmented with this row.

  3. Compute the gradient of the objective with respect to each of the row's ratings using (9).

  4. If the gradient with respect to an item's rating is negative, set that rating to the maximum feasible value; otherwise set it to the minimum feasible value.

  5. Copy the resulting row as many times as dictated by the antidote data budget.
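The steps above can be sketched as follows (the gradient row is assumed to be precomputed via (9); the sign convention assumes an objective to be minimized):

```python
import numpy as np

def heuristic1(grad_row, budget, r_min=0.0, r_max=5.0):
    """Given the gradient of the (to-be-minimized) objective with respect
    to a single antidote user's ratings, snap each rating to a boundary
    value and replicate the row to fill the budget."""
    row = np.where(grad_row < 0, r_max, r_min)  # move opposite the gradient
    return np.tile(row, (budget, 1))
```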

b.2. heuristic2

  1. Compute the gradient of the objective with respect to the estimated ratings and reshape it into a matrix of the same shape as $\hat{R}$.

  2. Approximate the gradient with respect to each item's antidote rating using this matrix and the original item factor $Y$, without running matrix factorization.

  3. If the approximate gradient for an item is negative, set that item's antidote rating to the maximum feasible value; otherwise set it to the minimum feasible value.

  4. Copy the resulting row as many times as dictated by the antidote data budget.

