# Individual Fairness Revisited: Transferring Techniques from Adversarial Robustness

We turn the definition of individual fairness on its head—rather than ascertaining the fairness of a model given a predetermined metric, we find a metric for a given model that satisfies individual fairness. This can facilitate the discussion on the fairness of a model, addressing the issue that it may be difficult to specify a priori a suitable metric. Our contributions are twofold: First, we introduce the definition of a minimal metric and characterize the behavior of models in terms of minimal metrics. Second, for more complicated models, we apply the mechanism of randomized smoothing from adversarial robustness to make them individually fair under a given weighted L^p metric. Our experiments show that adapting the minimal metrics of linear models to more complicated neural networks can lead to meaningful and interpretable fairness guarantees at little cost to utility.

## 1 Introduction

When machine learning models are deployed to make predictions about people, it is important that the model treats individuals fairly.

Individual fairness [dwork2012fairness] captures the notion that similar people should be treated similarly by imposing a continuity requirement on models. However, this raises the difficult societal question of how to define which people are “similar”.

We start in Section 3 from the insight that it may be easier to determine whether a given similarity metric is reasonable than it is to construct one from scratch. Thus, rather than imposing individual fairness with a predetermined similarity metric, we find a metric that corresponds to the behavior of a given model, which can then guide the discussion on whether the model is fair. To facilitate this, we introduce the notion of a minimal fairness metric, and show that in many cases there exists a unique metric that best characterizes the behavior of a given model for this purpose.

In Section 4, we deal with more complicated models, such as deep neural networks, whose minimal metrics are not easily computable. We show that we can make any model provably individually fair by post-processing it with randomized smoothing [cohen2019certified] to impose a given weighted metric. As randomized smoothing was originally applied as a defense against adversarial examples, our result brings to light the connection between individual fairness and adversarial robustness. However, our theorems are in a sense stronger because individual fairness is a uniform requirement that applies to all points in the input space, whereas the certified threshold of Cohen et al. is a function of the input point. Our Laplace and Gaussian smoothing mechanisms are versatile in that they can make a model provably individually fair under any given weighted metric, and we show the minimality of this metric for the smoothed model to argue that we do not add more noise than is necessary.

Finally, our experiments combine the two main elements of our paper—we smooth neural networks to be individually fair under a metric that is proportional to the minimal metrics of linear models trained on the same datasets. Our results on four real datasets show that the neural networks smoothed with Gaussian noise in particular are often approximately as accurate as the original models. Moreover, we can achieve models with similar favorable individual fairness guarantees to those of linear models while still enjoying the increased predictive accuracy enabled by the neural network.

#### Related Work.

Dwork et al. [dwork2012fairness] introduced the definition of individual fairness, which contrasts with group-based notions of fairness [hardt2016equality, zafar2017fairness-www] that require demographic groups to be treated similarly on average. Motivated in part by group fairness, Zemel et al. [zemel2013learning] learn a representation of the data that excludes information about a protected attribute, such as race or gender, whose use is often legally prohibited. This work has spurred more research on fair representations [calmon2017optimized, madras2018learning, tan2019learning], and the resulting representations implicitly define a similarity metric. However, unlike the weighted metrics that we use, these metrics are harder for humans to interpret and are primarily designed to attain group fairness.

Others approximate individual fairness based on a limited number of oracle queries, which represent human judgments, about whether a pair of individuals is similar. Gillen et al. [gillen2018online] attempt to learn a similarity metric that is consistent with the human judgments in the setting of online linear contextual bandits. In a more general setting, Ilvento [ilvento2019metric] derives an approximate metric using comparison queries that ask which of two individuals a given third individual is more similar to. Finally, Jung et al. [jung2019eliciting] apply constrained optimization directly without assuming that the human judgments are consistent with a metric.

By contrast, we post-process a model using randomized smoothing to provably ensure individual fairness. Cohen et al. [cohen2019certified] previously analyzed randomized smoothing in the context of adversarial robustness. In the context of fairness, most post-processing approaches [hardt2016equality, canetti2019soft] do not take individual fairness into account, and although Lohia et al. [lohia2019bias] consider individual fairness, they define two individuals to be similar if and only if they differ only in the pre-specified protected attribute.

## 2 Background

In this section, we present the definitions and notation that we will use throughout the paper.

###### Definition 1 (Distance metric).

A nonnegative function is a distance metric in if it satisfies the following three conditions: nonnegativity, symmetry, and triangle inequality.

In common mathematical usage, Definition 1 is a pseudometric, and metrics must also satisfy the condition that if and only if . However, throughout this paper we will refer to pseudometrics as metrics, following the convention in the field of metric learning.

One commonly used family of metrics is the standard metric, which is defined over . In this paper, we consider a more general family of metrics that allows each coordinate to be weighted differently.

###### Definition 2 (Weighted Lp metric).

The weighted metric, with and weights , is a distance metric in that is defined by the equation

$$D(x_1, x_2) = \sqrt[p]{\sum_{i=1}^{d} w_i \cdot |x_{1i} - x_{2i}|^p}, \tag{1}$$

where $x_{1i}$ and $x_{2i}$ are the $i$-th coordinates of $x_1$ and $x_2$, respectively.

We place the restriction that because otherwise the function does not satisfy the triangle inequality. When for all , we have the standard metric.
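Eq. 1 translates directly into code. The following is a minimal sketch, not taken from the paper: the function name `weighted_lp_distance` is hypothetical, and the sketch assumes finite $p \geq 1$ and nonnegative weights.

```python
import math
from typing import Sequence


def weighted_lp_distance(
    x1: Sequence[float],
    x2: Sequence[float],
    weights: Sequence[float],
    p: float,
) -> float:
    """Weighted L^p distance of Eq. 1 (assumes finite p >= 1, weights >= 0)."""
    if not (len(x1) == len(x2) == len(weights)):
        raise ValueError("x1, x2, and weights must have the same length")
    if p < 1 or math.isinf(p):
        raise ValueError("this sketch handles finite p >= 1 only")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    # Sum of w_i * |x_1i - x_2i|^p, followed by the p-th root.
    total = sum(w * abs(a - b) ** p for a, b, w in zip(x1, x2, weights))
    return total ** (1.0 / p)


# With unit weights the function reduces to the standard L^p metric.
euclidean = weighted_lp_distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0], p=2)
```

Setting a weight to zero makes the metric ignore that coordinate entirely, which is why the paper works with pseudometrics.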

Throughout this paper, we will use and to denote a model’s input and output spaces, respectively. Moreover, we will assume a distance metric that characterizes how close two points in the output space are.

###### Definition 3 (Individual fairness [dwork2012fairness]).

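A model $h: \mathcal{X} \to \mathcal{Y}$ is individually fair under metric $D_{\mathcal{X}}$ if, for all $x_1, x_2 \in \mathcal{X}$,

$$D_{\mathcal{Y}}(h(x_1), h(x_2)) \leq D_{\mathcal{X}}(x_1, x_2). \tag{2}$$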

Individual fairness captures the intuition that the model should not behave arbitrarily. In particular, it formalizes the notion that similar individuals should be treated similarly, i.e., given two individuals , if the distance between them is small, then the distance between the outputs of the model on these individuals should also be small.
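Definition 3 quantifies over all pairs of inputs, so a finite sample can only refute individual fairness, never certify it. Still, a pairwise check is a useful sanity test. The sketch below is illustrative only; `find_fairness_violations` and its arguments are hypothetical names.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def find_fairness_violations(
    model: Callable[[Point], float],
    points: Sequence[Point],
    d_x: Callable[[Point, Point], float],
    d_y: Callable[[float, float], float],
    tol: float = 1e-12,
) -> List[Tuple[Point, Point]]:
    """Return all sampled pairs for which Eq. 2 fails."""
    violations = []
    for x1, x2 in combinations(points, 2):
        # Eq. 2 requires D_Y(h(x1), h(x2)) <= D_X(x1, x2).
        if d_y(model(x1), model(x2)) > d_x(x1, x2) + tol:
            violations.append((x1, x2))
    return violations


# Toy example: h(x) = 2x is not fair under |x1 - x2| but is fair under 2|x1 - x2|.
sample = [(0.0,), (1.0,), (3.0,)]
h = lambda x: 2.0 * x[0]
abs_diff = lambda a, b: abs(a - b)
bad = find_fairness_violations(h, sample, lambda a, b: abs(a[0] - b[0]), abs_diff)
ok = find_fairness_violations(h, sample, lambda a, b: 2 * abs(a[0] - b[0]), abs_diff)
```

An empty result means only that no violation was found among the sampled pairs.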

## 3 Minimal Distance Metric

One criticism of individual fairness is that it is difficult to apply in practice because it requires one to specify the metric  [chouldechova2018frontiers]. The choice of a metric in dictates which individuals should be considered similar, which is highly context-dependent and often controversial. Thus, we take a slightly different approach—rather than specifying a metric and asking whether a model is individually fair under that metric, we find one metric under which the model is individually fair. Then, we can reason about whether the metric is appropriate for the task at hand.

However, there could be multiple metrics for which a model is individually fair. In fact, if for all , then any model that is individually fair under is also fair under , as the metrics are simply upper bounds on the extent to which a model’s outputs can vary. On the other hand, our goal is to characterize the behavior of a model, for which we need a tight upper bound. This notion of tightness is captured by the minimality of a distance metric, defined in Definition 4.

###### Definition 4 (Minimal distance metric).

Let be a set of distance metrics in . A metric is minimal in with respect to model if (1) is individually fair under , and (2) there does not exist a different such that is individually fair under and for all .

To see how one may reason about the minimal distance metric, consider a hiring model with a binary output that informs whether a given applicant should be hired. A natural in this setting is the 0-1 loss . Then, if the hiring model satisfies individual fairness under a metric such that whenever and differ only in race, we can reason that it does not directly use race to discriminate.

We now present Theorem 1, which identifies the unique minimal metric among the set of all metrics.

###### Theorem 1.

Let be a model, and let be the set of all metrics that satisfy the conditions in Definition 1. Then, the metric , defined as

$$D_{\mathcal{X}}(x_1, x_2) = D_{\mathcal{Y}}(h(x_1), h(x_2)) \tag{3}$$

for all , is the unique minimal metric in with respect to .

###### Proof.

We first prove that is a minimal metric, and later we will prove that no other metric is minimal. Since we assume to be a metric, it easily follows that is also a metric under Definition 1. Moreover, the equality in Eq. 2 always holds by our definition of , so is individually fair under . Thus, it remains to show that there does not exist a different such that is individually fair under and for all .

Suppose such exists. Since , there must exist some such that . Combining this with Eq. 3, we get

$$D_{\mathcal{Y}}(h(x_1), h(x_2)) = D_{\mathcal{X}}(x_1, x_2) > D'_{\mathcal{X}}(x_1, x_2), \tag{4}$$

which contradicts our assumption that is individually fair under .

Now we prove that is the unique minimal metric, arguing that cannot be minimal if . If there exist such that , then is not individually fair under by Eq. 4. Otherwise, we have for all and is individually fair under , so cannot be a minimal metric. ∎

Theorem 1 shows that the minimal metric in is defined directly in terms of the model in question. Ideally, we want the minimal metric to be simpler than the model so that it can help us interpret and reason about the fairness of the model. Thus, in the rest of this paper we only consider weighted metrics, which comprise a broad and interpretable family of metrics defined over .

With this set of metrics, we can no longer prove a theorem as general as Theorem 1, so we now prove a result for linear regression models. In this setting, we have $\mathcal{Y} = \mathbb{R}$, and the distance metric $D_{\mathcal{Y}}$ is simply the absolute value $D_{\mathcal{Y}}(y_1, y_2) = |y_1 - y_2|$. Theorem 2 identifies the weighted $L^1$ metric that is uniquely minimal for a given linear regression model.

###### Theorem 2.

Let be a linear regression model with coefficients , and let be the set of all weighted metrics. Then, the metric with weights is the unique minimal metric in with respect to .

###### Proof.

is clearly in by definition. To see that is individually fair under , note that for all

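$$\begin{aligned}
D_{\mathcal{Y}}(h(x_1), h(x_2)) &= |h(x_1) - h(x_2)| \\
&= \left|\sum_{i=1}^{d} \beta_i (x_{1i} - x_{2i})\right| \\
&\leq \sum_{i=1}^{d} |\beta_i (x_{1i} - x_{2i})| \\
&= D_{\mathcal{X}}(x_1, x_2).
\end{aligned} \tag{5}$$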

The rest of the proof closely mirrors the argument given in the proof of Theorem 1, so we only mention how the proofs differ. As in the proof of Theorem 1, we assume that there exists such that . Our goal is to show that is not individually fair, and for this proof we have the additional condition that . However, it is not necessarily true that , so we instead construct such that .

Let . With Eq. 1, we can verify that for any . Moreover, for all , so the equality in Eq. 5 holds if we replace by . Combining all of these relations, we arrive at the desired result:

$$D'_{\mathcal{X}}(x_1, x'_2) = D'_{\mathcal{X}}(x_1, x_2)$$
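The bound in Eq. 5 can be checked numerically. The sketch below assumes, as Eq. 5 suggests, that the metric of Theorem 2 is the weighted $L^1$ metric with weights $w_i = |\beta_i|$; the coefficient vector and all function names are hypothetical. It also shows that the bound holds with equality when two points differ in a single coordinate, which illustrates why none of these weights can be made smaller.

```python
import random
from typing import List, Sequence


def predict(beta: Sequence[float], x: Sequence[float]) -> float:
    """Linear regression model h(x) = sum_i beta_i * x_i (intercept omitted)."""
    return sum(b * v for b, v in zip(beta, x))


def minimal_l1_weights(beta: Sequence[float]) -> List[float]:
    """Assumed minimal weights w_i = |beta_i|, read off from Eq. 5."""
    return [abs(b) for b in beta]


def weighted_l1(x1: Sequence[float], x2: Sequence[float], w: Sequence[float]) -> float:
    return sum(wi * abs(a - b) for a, b, wi in zip(x1, x2, w))


beta = [2.0, -0.5, 0.0]  # hypothetical coefficients; the third feature is unused
weights = minimal_l1_weights(beta)

# Individual fairness (Eq. 2) holds on random pairs.
rng = random.Random(0)
for _ in range(1000):
    x1 = [rng.uniform(-5, 5) for _ in beta]
    x2 = [rng.uniform(-5, 5) for _ in beta]
    assert abs(predict(beta, x1) - predict(beta, x2)) <= weighted_l1(x1, x2, weights) + 1e-9

# Tightness: changing only the first coordinate attains equality in Eq. 5.
a, b = [1.0, 1.0, 1.0], [4.0, 1.0, 1.0]
gap = weighted_l1(a, b, weights) - abs(predict(beta, a) - predict(beta, b))
```

A feature with a zero coefficient receives zero weight, so two individuals who differ only in that feature are at distance zero and must receive the same prediction.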

## 4 Randomized Smoothing

For settings without a simple linear relation between the inputs and the outputs, neural networks often replace linear models. However, as evidenced by adversarial examples [szegedy2014intriguing, goodfellow2015explaining], neural networks are unlikely to be individually fair under distance metrics of reasonable size. Adversarial examples are inputs to the model that are created by applying a small perturbation to an original input with the goal of causing a very large change in the model’s output. These attacks are usually successful, showing a small change in can cause a large change in , which is contrary to individual fairness.

Previously, Cohen et al. [cohen2019certified] introduced randomized smoothing, a post-processing method that ensures that the post-processed model is robust against perturbations whose size, measured with the standard $L^2$ norm, is below a threshold that depends on the input point. In this section, for any given metric, we apply a modified version of randomized smoothing and prove that the resulting model is individually fair under that metric. We note that this result does not immediately follow from prior results: individual fairness imposes the same constraint on every point in the input space, whereas the certified threshold of Cohen et al. is a function of the input point.

In the rest of this paper, we assume that $\mathcal{Y}$ is categorical, following the setting of Cohen et al. [cohen2019certified]. In this section, we present two methods, each with a proof of fairness, for deriving an individually fair model $h$ from an arbitrary function $f$. Like the models considered by Dwork et al. [dwork2012fairness], our fair model maps $\mathcal{X}$ to $\Delta(\mathcal{Y})$, which is the set of probability distributions over $\mathcal{Y}$. It is important to note that $h$ is deterministic and that we treat its output simply as an array of probabilities. To avoid confusion with the randomness that we introduce in Section 5, we will write $h(x)[y]$ to denote the probability that the distribution $h(x)$ assigns to the class $y$.

###### Definition 5 (Randomized smoothing).

Let $f: \mathcal{X} \to \mathcal{Y}$ be an arbitrary model, and let $g$ be a probability distribution (we abuse notation and use $g$ to denote both the distribution and its probability density function). Then, the smoothed model $h_{f,g}$ is defined by

$$h_{f,g}(x)[y] = \int_{\mathbb{R}^d} \mathbb{1}[f(x+t) = y] \cdot g(t)\,dt \tag{6}$$

for all $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, and $g$ is called the smoothing distribution.

Intuitively, is the original model, and the value of the smoothed model at is found by querying on points around . We choose the points around according to the distribution , and the output of the smoothed model is a probability distribution of the values of at the queried points. To reason about the individual fairness of , we use the total variation distance (Eq. 7) to define the distance between probability distributions.

$$D_{\Delta(\mathcal{Y})}(Y_1, Y_2) = \frac{1}{2} \sum_{y \in \mathcal{Y}} |Y_1[y] - Y_2[y]|. \tag{7}$$
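The total variation distance of Eq. 7 is straightforward to compute for categorical outputs. A minimal sketch follows, representing each distribution as a dictionary from class label to probability; this representation and the name `total_variation` are choices made here for illustration.

```python
from typing import Hashable, Mapping


def total_variation(
    dist1: Mapping[Hashable, float],
    dist2: Mapping[Hashable, float],
) -> float:
    """Total variation distance of Eq. 7; missing labels count as probability 0."""
    labels = set(dist1) | set(dist2)
    return 0.5 * sum(abs(dist1.get(y, 0.0) - dist2.get(y, 0.0)) for y in labels)


# Two output distributions over the classes {0, 1}.
distance = total_variation({0: 0.5, 1: 0.5}, {0: 1.0, 1: 0.0})
```

For binary classification the distance reduces to the absolute difference between the two probabilities of class 1.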

### 4.1 Laplace Smoothing Distribution

One main difference between this setting and that in Section 3 is that we have a choice of the smoothing distribution . Thus, instead of simply finding a metric under which the model is individually fair, we adapt the smoothing distribution to a given metric. Theorem 3 shows that, for any weighted metric , there exists a smoothing distribution that guarantees that is individually fair under for all .

###### Theorem 3 (Laplace smoothing).

Let be the set of all weighted metrics. For any , let , where is the normalization factor . Then, is individually fair under for all .

###### Proof.

We will show that for all .

First, since for all weighted metric , we have

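$$D_{\mathcal{X}}(0, t) - D_{\mathcal{X}}(0, \epsilon) \leq D_{\mathcal{X}}(0, t - \epsilon) \leq D_{\mathcal{X}}(0, t) + D_{\mathcal{X}}(0, \epsilon) \tag{8}$$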

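by the triangle inequality. We can apply the second inequality in Eq. 8 to bound the density $g(t-\epsilon)$ from below in terms of $g(t)$.

$$\begin{aligned}
g(t - \epsilon) &= \exp(-2 D_{\mathcal{X}}(0, t - \epsilon)) / Z \\
&\geq \exp(-2 [D_{\mathcal{X}}(0, t) + D_{\mathcal{X}}(0, \epsilon)]) / Z \\
&= g(t) \cdot \exp(-2 D_{\mathcal{X}}(0, \epsilon))
\end{aligned}$$

Then, for all $y \in \mathcal{Y}$ we have

$$\begin{aligned}
h_{f,g}(x + \epsilon)[y] &= \int_{\mathbb{R}^d} \mathbb{1}[f(x + \epsilon + t) = y] \cdot g(t)\,dt \\
&= \int_{\mathbb{R}^d} \mathbb{1}[f(x + t) = y] \cdot g(t - \epsilon)\,dt \\
&\geq \int_{\mathbb{R}^d} \mathbb{1}[f(x + t) = y] \cdot g(t) \cdot \exp(-2 D_{\mathcal{X}}(0, \epsilon))\,dt \\
&= h_{f,g}(x)[y] \cdot \exp(-2 D_{\mathcal{X}}(0, \epsilon)).
\end{aligned} \tag{9}$$

Similarly, we can apply the first inequality in Eq. 8 to derive the upper bound

$$h_{f,g}(x + \epsilon)[y] \leq h_{f,g}(x)[y] \cdot \exp(2 D_{\mathcal{X}}(0, \epsilon)). \tag{10}$$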

We can now apply a previous result by Kairouz et al. [kairouz2016extremal, Theorem 6] to determine the maximum distance between and that is attainable with the above constraints. For brevity, let denote . In the context of -local differential privacy, Kairouz et al. showed that the maximum possible total variation distance is . Replacing with , we see that the distance between and is at most .

Finally, it remains to be proven that this quantity is not more than , which is equivalent to for weighted metrics. Since this distance can be written as , it suffices to show that for all . This inequality follows from the fact that equality holds at and that the derivative of the right-hand side is never less than that of the left-hand side for . ∎
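The final step of the proof can be spelled out as a short worked calculation. It assumes that the bound from Kairouz et al. takes the standard local differential privacy form $(e^{a}-1)/(e^{a}+1)$ for privacy parameter $a$, and that Eqs. 9 and 10 correspond to $a = 2 D_{\mathcal{X}}(0, \epsilon)$; the shorthand $s$ and $a$ are introduced here for illustration only.

$$\frac{e^{2s} - 1}{e^{2s} + 1} = \tanh(s), \qquad s = D_{\mathcal{X}}(0, \epsilon) \geq 0.$$

$$\tanh(0) = 0, \qquad \frac{d}{ds}\tanh(s) = 1 - \tanh^2(s) \leq 1 = \frac{d}{ds}\,s.$$

Hence $\tanh(s) \leq s$ for all $s \geq 0$, so under these assumptions the total variation distance between the two smoothed outputs is at most $D_{\mathcal{X}}(0, \epsilon)$. The bound is nearly tight for small $s$, where $\tanh(s) \approx s$, and becomes loose as $s$ grows because the total variation distance can never exceed 1.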

Although Theorem 3 identifies a smoothing distribution that ensures the individual fairness of the resulting model under $D_{\mathcal{X}}$, we also want the smoothed model $h_{f,g}$ to retain the utility of the original model $f$. In the extreme case where $g$ is the uniform distribution over the whole input space, the resulting model will be a constant function and therefore satisfy individual fairness under any metric, but it will not be very useful for classification tasks. More generally, smoothed models that are individually fair under smaller distance metrics tend to not preserve as much locally relevant information about $f$. Thus, we can argue that a smoothing distribution does not unnecessarily lower the model’s utility by showing that $D_{\mathcal{X}}$ is minimal. The definition of minimality that we use here differs from Definition 4 in that the smoothed model must be individually fair for all $f$.

###### Definition 6 (Minimal distance metric, smoothing).

Let be a set of distance metrics in . A metric is minimal in with respect to a smoothing distribution if (1) is individually fair under for all , and (2) there does not exist a different such that is individually fair under for all and for all .

For general weighted metrics, the inequalities in Eq. 8 are strict for most , so the bounds in Eqs. 10 and 9 are not tight, and is not guaranteed to be minimal. On the other hand, if is a weighted metric, Theorem 4 shows that it is minimal with respect to its Laplace smoothing distribution.

###### Theorem 4.

Let be the set of all weighted metrics. For any , let , where is the normalization factor . Then, is uniquely minimal in with respect to .

###### Proof.

We have already proven in Theorem 3 that is individually fair under for all . It remains to show that there does not exist a different such that is individually fair under for all and for all .

Let and be the weights of and , respectively. If for all , we must have for all . Moreover, since , there exists such that . We now construct such that is not individually fair under .

Let be a function such that . We will show that there exists such that , where is the basis vector that is one in the -th coordinate and zero in all others. Applying Eq. 6 and simplifying, we get and . Therefore, the distance is . Moreover, we have . The ratio approaches as , so when is sufficiently small, we have .

Uniqueness follows from the argument given in the last paragraph of the proof of Theorem 1. ∎

### 4.2 Gaussian Smoothing Distribution

As we show in Section 6, in practice Laplace smoothing distributions do not preserve well the utility of due to their relatively high densities at the tails. Thus, we present Gaussian smoothing as an alternative, which Theorem 5 shows is individually fair under any weighted metric. Since for any weighted and metrics and with the same weights, we can then scale the weights accordingly to make fair under any given metric. For simplicity, we only consider the setting of binary classification, i.e., .

###### Theorem 5 (Gaussian smoothing).

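Let $D_{\mathcal{X}}$ be a weighted $L^2$ metric with weights $w_1, \ldots, w_d$, and let $\Sigma$ be a diagonal matrix with $\Sigma_{ii} = 1/(2\pi w_i)$. If $g$ is Gaussian with mean $0$ and variance $\Sigma$, $h_{f,g}$ is individually fair under $D_{\mathcal{X}}$ for all $f$.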

To prove this theorem, we will apply the Neyman–Pearson lemma [neyman1933problem], as formulated by Cohen et al. [cohen2019certified, Lemma 3].

###### Lemma 6 (Neyman–Pearson).

Let and be random variables in with densities and , and let such that if and only if for some threshold . Then,

###### Proof of Theorem 5.

We proceed by showing that for all and . For any given , we will first find such that

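$$D_{\Delta(\mathcal{Y})}(h_{f,g}(x), h_{f,g}(x+\epsilon)) \leq D_{\Delta(\mathcal{Y})}(h_{f^\star,g}(x), h_{f^\star,g}(x+\epsilon)). \tag{11}$$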

We will then show that is individually fair under , which together with Eq. 11 implies that is also individually fair under .

Fix , and assume without loss of generality that . Then, we have

$$D_{\Delta(\mathcal{Y})}(h_{f,g}(x), h_{f,g}(x+\epsilon)) = h_{f,g}(x+\epsilon)[1] - h_{f,g}(x)[1]. \tag{12}$$

We apply Lemma 6 by choosing and such that . By Eq. 6, we have and , and similar relations hold between and . Therefore, if there exists that satisfies the condition in Lemma 6 such that , then . Combining these two (in)equalities, we get

$$h_{f,g}(x+\epsilon)[1] - h_{f,g}(x)[1] \leq h_{f^\star,g}(x+\epsilon)[1] - h_{f^\star,g}(x)[1],$$

and Eq. 11 follows from Eq. 12 and its counterpart.

We now show that it is possible to find such that . By construction, we have that if and only if

$$\frac{\mu_{X_2}(t)}{\mu_{X_1}(t)} = \frac{g(t - x - \epsilon)}{g(t - x)} \geq k$$

for some . Substituting in the Gaussian density function and solving for , we see that this inequality holds whenever , where is a constant with respect to . When evaluating as per Eq. 6, is distributed normally, and therefore is also (univariate) Gaussian. Thus, we can obtain the necessary by picking the appropriate value of .

Finally, it remains to show that is individually fair under . Let , and let be the density function of . With some computation, we see that and differ if and only if . Moreover, since has variance , the variance of is , and thus the maximum value of is . We apply these two facts to arrive at the desired result:

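$$\begin{aligned}
D_{\Delta(\mathcal{Y})}(h_{f^\star,g}(x), h_{f^\star,g}(x+\epsilon)) &= h_{f^\star,g}(x+\epsilon)[1] - h_{f^\star,g}(x)[1] \\
&= \int_{\mathbb{R}^d} (f^\star(x+\epsilon+t) - f^\star(x+t)) \cdot g(t)\,dt \\
&= \int_{\mathbb{R}^d} (f^\star(t+\epsilon) - f^\star(t)) \cdot g(t-x)\,dt \\
&= \int_{\kappa}^{\kappa + \epsilon^T \Sigma^{-1} \epsilon} \gamma(\tau - \epsilon^T \Sigma^{-1} x)\,d\tau \\
&\leq \epsilon^T \Sigma^{-1} \epsilon \cdot (2\pi\, \epsilon^T \Sigma^{-1} \epsilon)^{-1/2} \\
&= \sqrt{\sum_{i=1}^{d} \epsilon_i^2 / (2\pi \Sigma_{ii})} \\
&= \sqrt{\sum_{i=1}^{d} \epsilon_i^2 \cdot w_i} = D_{\mathcal{X}}(x, x+\epsilon). \qquad ∎
\end{aligned}$$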

We end this section with Theorem 7, which states that an metric is minimal with respect to its Gaussian smoothing distribution. We omit the proof since it is very similar to that of Theorem 4.

###### Theorem 7.

Let be the set of all weighted metrics. For any , let be the weights, and let be a diagonal matrix with . If is a Gaussian with mean and variance , is uniquely minimal in with respect to .
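The relationship between the metric weights and the Gaussian noise scale can be illustrated in one dimension. The sketch below assumes $\Sigma_{ii} = 1/(2\pi w_i)$, as in the last lines of the proof of Theorem 5, and strictly positive weights. The threshold classifier $f(x) = \mathbb{1}[x \geq 0]$ and all function names are hypothetical; for this $f$ the smoothed model has the closed form $\Phi(x/\sigma)$, so no sampling is needed.

```python
import math
import random
from typing import List, Sequence


def gaussian_sigmas(weights: Sequence[float]) -> List[float]:
    """Per-coordinate standard deviations, assuming Sigma_ii = 1 / (2 * pi * w_i)."""
    if any(w <= 0 for w in weights):
        raise ValueError("this sketch requires strictly positive weights")
    return [math.sqrt(1.0 / (2.0 * math.pi * w)) for w in weights]


def sample_gaussian_noise(weights: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one noise vector t from the zero-mean Gaussian smoothing distribution."""
    return [rng.gauss(0.0, s) for s in gaussian_sigmas(weights)]


def smoothed_step_prob(x: float, w: float) -> float:
    """h_{f,g}(x)[1] for f(x) = 1[x >= 0] in one dimension: Pr[x + t >= 0]."""
    sigma = gaussian_sigmas([w])[0]
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


# For binary outputs, the total variation distance is the change in class-1
# probability, and the weighted L^2 distance in one dimension is sqrt(w) * |eps|.
w = 4.0
worst_ratio = max(
    abs(smoothed_step_prob(x + eps, w) - smoothed_step_prob(x, w)) / (math.sqrt(w) * eps)
    for x in [-1.0, -0.3, 0.0, 0.2, 1.5]
    for eps in [0.01, 0.1, 1.0]
)
```

In this example the ratio approaches 1 for small perturbations around the decision threshold, which matches the claim that the metric cannot be shrunk without breaking the guarantee.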

## 5 Practical Implementation

In practice, it is infeasible to compute $h_{f,g}$ exactly because of the integral in Eq. 6. Therefore, to apply randomized smoothing in practice, we approximate the integral with Algorithm 1, i.e., by sampling $n$ points independently from the smoothing distribution $g$, evaluating the model $f$ with this noise added to $x$, and returning the observed probability of predicting each class on the sampled points. However, the resulting model may not be individually fair due to the finite sample size. Thus, we define $(\epsilon, \delta)$-individual fairness, which requires that the model be close to individually fair with high probability, and prove that the sampled model satisfies it.
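The sampling procedure described above can be sketched as follows. This is an illustration of the idea rather than a reproduction of Algorithm 1; `smoothed_predict` and the `sample_noise` callback are hypothetical names.

```python
import random
from collections import Counter
from typing import Callable, Dict, Hashable, List, Sequence


def smoothed_predict(
    f: Callable[[List[float]], Hashable],
    x: Sequence[float],
    sample_noise: Callable[[random.Random], Sequence[float]],
    n: int,
    rng: random.Random,
) -> Dict[Hashable, float]:
    """Monte Carlo estimate of h_{f,g}(x): empirical class frequencies of f(x + t)."""
    counts: Counter = Counter()
    for _ in range(n):
        t = sample_noise(rng)  # one independent draw from the smoothing distribution g
        counts[f([xi + ti for xi, ti in zip(x, t)])] += 1
    return {label: c / n for label, c in counts.items()}


# Toy usage: a one-dimensional threshold classifier smoothed with unit Gaussian noise.
rng = random.Random(0)
classifier = lambda z: int(z[0] >= 0.0)
unit_gaussian = lambda r: [r.gauss(0.0, 1.0)]
estimate = smoothed_predict(classifier, [0.5], unit_gaussian, n=2000, rng=rng)
```

Each call draws fresh noise, so two calls on the same input generally return slightly different distributions; this is the randomness that the following definition accounts for.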

###### Definition 7 ((ϵ,δ)-individual fairness).

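A randomized model $h$ is $(\epsilon, \delta)$-individually fair under metric $D_{\mathcal{X}}$ if, for all $x_1, x_2 \in \mathcal{X}$,

$$D_{\Delta(\mathcal{Y})}(h(x_1), h(x_2)) \leq D_{\mathcal{X}}(x_1, x_2) + \epsilon \tag{13}$$

with probability at least $1 - \delta$. The probability is taken over the randomness of $h$.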

###### Theorem 8.

Let be a model that approximates with samples. If and is fair under , then is -individually fair under .

###### Proof.

Consider any two points . Since is individually fair under , we have

$$\frac{1}{2} \sum_{y \in \mathcal{Y}} |h_{f,g}(x_1)[y] - h_{f,g}(x_2)[y]| = D_{\Delta(\mathcal{Y})}(h_{f,g}(x_1), h_{f,g}(x_2)) \leq D_{\mathcal{X}}(x_1, x_2).$$

We will show that with probability less than , where . Then, by union bound, with probability at least we will have for all , which leads to our desired result.

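$$\begin{aligned}
D_{\Delta(\mathcal{Y})}(h^n_{f,g}(x_1), h^n_{f,g}(x_2)) &\leq \frac{1}{2} \sum_{y \in \mathcal{Y}} |h_{f,g}(x_1)[y] - h_{f,g}(x_2)[y]| + \frac{1}{2} \sum_{y \in \mathcal{Y}} |\mathrm{diff}[y]| \\
&\leq D_{\mathcal{X}}(x_1, x_2) + \epsilon
\end{aligned}$$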

Fix , and let , where is the -th sample drawn from the smoothing distribution while evaluating . Then, and , so . The theorem follows from Hoeffding’s inequality.

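$$\begin{aligned}
\Pr[|\mathrm{diff}| > 2\epsilon/m] &= \Pr\left[\left|\frac{1}{2n} \sum_{j=1}^{n} \left(X_{1j} - X_{2j} - \mathbb{E}[X_{1j} - X_{2j}]\right)\right| > \epsilon/m\right] \\
&< 2e^{-4n\epsilon^2/m^2} \qquad ∎
\end{aligned}$$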
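The tail bound above gives a rule of thumb for choosing the number of samples. The sketch below is derived here rather than quoted from Theorem 8: it assumes that $m$ is the number of classes, that each class must fail with probability at most $\delta/m$ so that the union bound applies, and it solves $2e^{-4n\epsilon^2/m^2} \leq \delta/m$ for $n$. The function name is hypothetical.

```python
import math


def samples_needed(eps: float, delta: float, m: int) -> int:
    """Smallest n with 2 * m * exp(-4 * n * eps^2 / m^2) <= delta.

    Assumes the per-class Hoeffding bound displayed in the proof and a
    union bound over the m classes.
    """
    if not (eps > 0 and 0 < delta < 1 and m >= 1):
        raise ValueError("need eps > 0, 0 < delta < 1, and m >= 1")
    return math.ceil(m * m * math.log(2 * m / delta) / (4 * eps * eps))


# Binary classification with slack 0.1 and failure probability 0.05.
n = samples_needed(eps=0.1, delta=0.05, m=2)
```

The required sample size grows quadratically in $1/\epsilon$ but only logarithmically in $1/\delta$, so tightening the slack is far more expensive than tightening the failure probability.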

### 5.1 Noise Sampling

Implementations of Gaussian noise sampling are commonly included in data analysis libraries. For Laplace noise sampling, we apply Algorithm 2, which describes how to sample a point from the Laplace smoothing distribution when . Without loss of generality, we assume that is a standard metric since we can simply rescale each coordinate by its weight . Recall that . When , this quantity becomes , so each coordinate can be sampled from the Laplace distribution independently of the others. For other values of , the coordinates are not independent, so we instead sample the distance and then pick a point uniformly at random on the sphere () or hypercube () of radius .

To sample the distance $r$, we note that the set of points at distance $r$ from the origin has surface area proportional to $r^{d-1}$. Hence, the probability of drawing a point in this set from the distribution is proportional to $r^{d-1} e^{-2r}$, and the cumulative distribution function of $r$ is $P(d, 2r)$, where $P$ is the regularized lower incomplete gamma function. Finally, computing the inverse of this function allows us to sample $r$ through inverse transform sampling.
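The sampling steps for $p = 1$ and $p = 2$ can be sketched as follows. This is not Algorithm 2 itself: instead of inverting the incomplete gamma function, the sketch draws the radius with the standard library's gamma sampler, which targets the same density proportional to $r^{d-1}e^{-2r}$. It also assumes that the smoothing density is proportional to $\exp(-2D_{\mathcal{X}}(0,t))$, as in the proof of Theorem 3, and that the weights are strictly positive. All function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sample_laplace_l1(weights: Sequence[float], rng: random.Random) -> List[float]:
    """Weighted L^1 case: coordinates are independent Laplace with scale 1 / (2 * w_i)."""
    # The difference of two unit exponentials is a standard Laplace variable.
    return [(rng.expovariate(1.0) - rng.expovariate(1.0)) / (2.0 * w) for w in weights]


def sample_laplace_l2(weights: Sequence[float], rng: random.Random) -> List[float]:
    """Weighted L^2 case: gamma-distributed radius, uniformly random direction."""
    d = len(weights)
    # Radius density proportional to r^(d-1) * exp(-2r): Gamma(shape=d, scale=1/2).
    r = rng.gammavariate(d, 0.5)
    # A normalized Gaussian vector is uniform on the unit sphere.
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction))
    # Map the standard-metric sample u back to weighted coordinates: t_i = u_i / sqrt(w_i).
    return [r * v / norm / math.sqrt(w) for v, w in zip(direction, weights)]


# Sanity check: the expected distance from the origin is d / 2 in both cases.
rng = random.Random(0)
w = [1.0, 4.0, 0.25]
mean_l1 = sum(
    sum(wi * abs(ti) for wi, ti in zip(w, sample_laplace_l1(w, rng)))
    for _ in range(20000)
) / 20000
mean_l2 = sum(
    math.sqrt(sum(wi * ti * ti for wi, ti in zip(w, sample_laplace_l2(w, rng))))
    for _ in range(20000)
) / 20000
```

The hypercube case would follow the same pattern with a point drawn uniformly from the surface of the cube instead of the sphere.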