1 Introduction
Deep learning's success is made possible in part by the availability of big datasets, which are often distributed across several owners and cannot be pooled centrally. To resolve this, researchers propose Federated Learning (FL), which enables parallel training of a unified model [18]. Such a requirement naturally arises for mobile phones, network sensors, and other IoT applications. In FL, these respective owners are referred to as 'clients' (henceforth agents). The agents individually train a model on their private data. A 'central server,' referred to henceforth as an aggregator, receives the individual models and computes a single overall model through different heuristics, aiming for high performance on any test data
[28]. The data available with each agent is often imbalanced or biased. Machine learning models may further amplify this bias. More concretely, when trained only to achieve high accuracy, a model's predictions become highly biased towards certain demographic groups defined by attributes like gender, age, or race [4, 5, 7]. Such attributes are known as sensitive attributes. Following the impossibility result on achieving a perfectly unbiased model [7], researchers propose several approaches that focus on minimizing bias while maintaining high accuracy [6, 17, 2, 23].
Invariably, all these approaches require knowledge of the sensitive attribute. These attributes often comprise the most critical personal information. Legal regulations in various jurisdictions prohibit the use of such attributes for developing ML models; e.g., the EU General Data Protection Regulation restricts the collection of sensitive user attributes [26]. Thus, it is imperative to address discrimination while preventing the leakage of sensitive attributes from the data samples.
Because the aggregator in FL has no direct access to private data or sensitive attributes, FL prima facie preserves privacy. However, several attacks highlight the information leakage in an FL setting [20]. To plug this leak, researchers either use cryptographic solutions, based mainly on complex Partial Homomorphic Encryption (PHE), or use Differential Privacy (DP). Private FL solutions using PHE (e.g., [19, 30, 31, 12]) suffer from computational inefficiency and post-processing attacks. Thus, in this work we focus on the strong privacy guarantees provided by a differentially private solution [24, 25, 22, 29].
FPFL Framework. As a first, we incorporate both fairness and privacy guarantees for an FL setting through our novel framework FPFL: Fair and Private Federated Learning. Our primary goal is to simultaneously preserve the privacy of the training data and the sensitive attribute while ensuring fairness. With FPFL, we achieve this goal by decoupling the training process into two phases. In Phase 1, each agent trains a model on its private dataset for fair predictions. In Phase 2, the agents train a differentially private model to mimic the fair predictions of the first model. Finally, each agent communicates only the private model from Phase 2 to the aggregator.
Fairness and Privacy Notions. This deliberate phase-wise training ensures an overall fair and accurate model that does not encode any information related to the sensitive attribute. Our framework is general: it accommodates any fairness or DP metric and allows any fairness-guaranteeing technique in Phase 1. In this paper, we demonstrate FPFL's efficacy w.r.t. a state-of-the-art technique and the following notions.

Fairness: We consider demographic parity (DemP) and equalized odds (EO). DemP states that a model's predictions are independent of a sensitive attribute of the dataset [10]. EO states that the false positive and false negative rates of a model are equal across different groups, i.e., independent of the sensitive attribute [14].
Privacy: We quantify the privacy guarantees with the notion of local differential privacy (local-DP) [11]. Local-DP is a natural fit for FL. In our setting, the aggregator acts as an adversary with access to each agent's model. We show that with local-DP, the privacy of each agent's training data and sensitive attribute is protected from the aggregator.
Empirical interplay between accuracy, fairness, and privacy. The authors in [3] were among the first to show that ensuring privacy may come at a cost to fairness. While the trade-off between fairness and privacy in ML is underexplored, Table 1 summarizes the existing literature. Apart from those, the authors in [16] consider accuracy parity, which is weaker than EO, in a non-private FL setting.
Contributions. In summary, we propose our novel FPFL framework (Fig. 2). We prove that FPFL provides the local-DP guarantee for both the training data and the sensitive attribute(s) (Proposition 1). Our experiments on the Adult, Bank, and Dutch datasets show the empirical trade-off between fairness, privacy, and accuracy of an FL model (Section 4).
2 Preliminaries
We consider a binary classification problem with $\mathcal{X} \subseteq \mathbb{R}^d$ as our ($d$-dimensional) instance space and $\mathcal{Y} = \{0, 1\}$ as our output space. We consider a single sensitive attribute $A$ associated with each individual instance. Such an attribute may represent sensitive information like age, gender, or caste. Each $a \in A$ represents a particular category of the sensitive attribute, like male or female.
Federated Learning Model.
Federated Learning (FL) decentralizes the classical machine learning training process. FL comprises two types of actors: (i) a set of $m$ agents, where each agent $i$ owns a private dataset $D_i$ with cardinality $n_i = |D_i|$ (we use the subscript $i$ when referring to a particular agent and drop it otherwise); and (ii) an aggregator. Each agent provides its model, trained on its dataset, to the aggregator. The aggregator's job then is to derive an overall model, which is communicated back to the agents. This back-and-forth process continues until a model with sufficient accuracy is derived.
At the start of FL training, the aggregator communicates an initial, often random, set of model parameters to the agents; let us refer to these as $\theta_0$. At each timestep $t$, each agent $i$ updates its individual parameters, denoted $\theta_{i,t}$, using its private dataset. The agents then communicate the updated parameters to the aggregator, who derives an overall model through different heuristics [28]. In this paper, we focus on the weighted-sum heuristic, i.e., the overall model parameters take the form $\theta_t = \sum_i w_i \,\theta_{i,t}$, where $w_i$ is agent $i$'s weight. To distinguish the final overall model, we refer to it as $\theta_T$, calculated at the final timestep $T$.
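The weighted-sum aggregation heuristic can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (the function name and the flattened-array representation of parameters are ours, not from the paper):

```python
import numpy as np

def aggregate(agent_params, weights):
    """FedAvg-style weighted sum: combine per-agent parameter vectors
    theta_i into overall parameters theta = sum_i w_i * theta_i."""
    weights = np.asarray(weights, dtype=float)
    # The weights form a convex combination, e.g. dataset-size fractions.
    assert np.isclose(weights.sum(), 1.0)
    return sum(w * p for w, p in zip(weights, agent_params))
```

In practice each agent's parameters span several weight tensors; the same weighted sum is applied tensor by tensor.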
Fairness Metrics.
We consider the following two fairness constraints.
Definition 1 (Demographic Parity (DemP))
A classifier $h$ satisfies Demographic Parity under a distribution over $(\mathcal{X}, A, \mathcal{Y})$ if its predictions $h(X)$ are independent of the sensitive attribute $A$. Given that $\hat{y} \in \{0, 1\}$, we have
$$\mathbb{E}[h(X) \mid A = a] = \mathbb{E}[h(X)], \quad \forall a \in A.$$
Definition 2 (Equalized Odds (EO))
A classifier $h$ satisfies Equalized Odds under a distribution over $(\mathcal{X}, A, \mathcal{Y})$ if its predictions $h(X)$ are independent of the sensitive attribute $A$ given the label $Y$. That is,
$$\Pr[h(X) = \hat{y} \mid A = a, Y = y] = \Pr[h(X) = \hat{y} \mid Y = y], \quad \forall a \in A, \; y, \hat{y} \in \{0, 1\}.$$
Given that $\hat{y} \in \{0, 1\}$, we can say
$$\mathbb{E}[h(X) \mid A = a, Y = y] = \mathbb{E}[h(X) \mid Y = y], \quad \forall a \in A, \; y \in \{0, 1\}.$$
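Both notions can be estimated empirically from a model's predictions. The sketch below (our own illustration, assuming a binary sensitive attribute; the helper names are hypothetical) computes the DemP and EO gaps, where a gap of 0 means the respective notion holds exactly:

```python
import numpy as np

def demp_gap(y_pred, a):
    """Demographic-parity gap: difference in positive-prediction rates
    across the two sensitive groups (0 = DemP holds exactly)."""
    y_pred, a = np.asarray(y_pred), np.asarray(a)
    return abs(y_pred[a == 1].mean() - y_pred[a == 0].mean())

def eo_gap(y_pred, y_true, a):
    """Equalized-odds gap: max over true labels y in {0, 1} of the
    group-conditional prediction-rate difference (covers both FPR
    and TPR disparities)."""
    y_pred, y_true, a = map(np.asarray, (y_pred, y_true, a))
    gaps = []
    for y in (0, 1):
        m = y_true == y
        gaps.append(abs(y_pred[m & (a == 1)].mean()
                        - y_pred[m & (a == 0)].mean()))
    return max(gaps)
```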
Local Differential Privacy (LDP).
We now define LDP in the context of our FL model. We remark that LDP does not require defining adjacency.
Definition 3 (Local Differential Privacy (LDP) [11])
For an input set $\mathcal{V}$ and a set of noisy outputs $\mathcal{O}$, a randomized algorithm $\mathcal{M} : \mathcal{V} \rightarrow \mathcal{O}$ is said to be $(\epsilon, \delta)$-LDP if $\forall v, v' \in \mathcal{V}$ and $\forall o \in \mathcal{O}$ the following holds,
$$\Pr[\mathcal{M}(v) = o] \le e^{\epsilon} \, \Pr[\mathcal{M}(v') = o] + \delta. \quad (1)$$
LDP provides a statistical guarantee against any inference the adversary can make based on the output of $\mathcal{M}$. This guarantee is upper-bounded by $\epsilon$, referred to as the privacy budget. $\epsilon$ is a metric of privacy loss, defined as
$$\epsilon = \ln \frac{\Pr[\mathcal{M}(v) = o]}{\Pr[\mathcal{M}(v') = o]}. \quad (2)$$
The privacy budget $\epsilon$ controls the trade-off between the quality (or, in our case, the accuracy) of the output vis-à-vis the privacy guarantee. That is, there is no "free lunch": the lower the budget, the better the privacy, but at the cost of quality. The $\delta$ parameter in (1) allows the upper bound $e^{\epsilon}$ to be violated, but only with a small probability.
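As a concrete instance of Definition 3, consider the classic randomized-response mechanism (our illustrative example, not part of FPFL). It satisfies $\epsilon$-LDP with $\delta = 0$, since for any output the likelihood ratio between two inputs is exactly $e^{\epsilon}$:

```python
import math
import random

def randomized_response(bit, epsilon, rng=None):
    """Report the true bit with probability e^eps / (e^eps + 1),
    otherwise flip it. For any output o and inputs v, v', the ratio
    Pr[M(v) = o] / Pr[M(v') = o] equals e^eps, so the mechanism is
    (eps, 0)-LDP per Definition 3."""
    rng = rng or random.Random(0)
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit
```

A small $\epsilon$ drives the truth-telling probability toward 1/2 (strong privacy, noisy output); a large $\epsilon$ makes the report almost always truthful.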
Differentially private ML solutions focus on preserving an individual's privacy within a dataset. Such privacy may be compromised during the training process or through the predictions of the trained model [13]. The best known such approach is the DP-SGD algorithm, introduced in [1]. In DP-SGD, the authors sanitize the gradients computed by the Stochastic Gradient Descent (SGD) algorithm with Gaussian noise $\mathcal{N}(0, \sigma^2)$. This step controls the influence of individual training samples on the training process.
Adversary Model.
Towards designing a private FL system, it suffices to provide DP guarantees for any possible information leak to the aggregator. The post-processing property of DP then preserves the DP guarantee for the training data and the sensitive attribute against any other party.
We consider the "black-box" model for our adversary, i.e., the aggregator has access to the trained model and can interact with it via inputs and outputs. With this access, it can perform model-inversion attacks [13], among others.
3 FPFL: Fair and Private Federated Learning
In FPFL (Figure 2), we consider a classification problem. Each agent deploys two multi-layer neural networks (NNs) to learn the model parameters, one per phase. The training comprises two phases: (i) in Phase 1, each agent privately trains a model on its private dataset to learn a highly fair and accurate model; and (ii) in Phase 2, each agent trains a second model, with DP guarantees, to mimic the first. This process is akin to knowledge distillation [15]. In FPFL, only the model trained in Phase 2 is communicated to the aggregator.
To enhance readability and remain consistent with FL notation, we denote the model parameters learned in Phase 1 by $\phi$ and in Phase 2 by $\theta$. Likewise, we denote the total number of training steps in Phase 1 by $T_1$ and in Phase 2 by $T_2$.
Phase 1: FairSGD. In this phase, we train the network to maximize accuracy while achieving the best possible fairness on each agent's private dataset. We adapt the Lagrangian multiplier method [23] to achieve a fair and accurate model. We denote the model for agent $i$ as $h_{\phi_i}$ with parameters $\phi_i$. Briefly, the method trains a network with a unified loss that has two components. The first component maximizes accuracy, i.e., the cross-entropy loss,
$$L_{CE}(\phi) = -\mathbb{E}\left[\, y \log h_{\phi}(x) + (1 - y) \log\left(1 - h_{\phi}(x)\right) \right].$$
The second component of the loss is a specific fairness measure. For achieving DemP (Definition 1), the loss function is given by,
$$L_{DemP}(\phi) = \max_{a \in A} \left| \mathbb{E}[h_{\phi}(x) \mid A = a] - \mathbb{E}[h_{\phi}(x)] \right|. \quad (3)$$
For achieving EO (Definition 2), the corresponding loss function is,
$$L_{EO}(\phi) = \max_{a \in A, \; y \in \{0,1\}} \left| \mathbb{E}[h_{\phi}(x) \mid A = a, Y = y] - \mathbb{E}[h_{\phi}(x) \mid Y = y] \right|. \quad (4)$$
Hence, the overall loss from the Lagrangian method is,
$$L(\phi, \lambda) = L_{CE}(\phi) + \lambda \, L_{fair}(\phi), \quad L_{fair} \in \{L_{DemP}, L_{EO}\}. \quad (5)$$
In the equation above, $\lambda$ is the Lagrangian multiplier. The overall optimization is as follows: $\min_{\phi} \max_{\lambda \ge 0} L(\phi, \lambda)$. Thus, each agent trains the FairSGD model to obtain the best accuracy w.r.t. a given fairness metric. We present it formally in Algorithm 1.
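The unified loss can be sketched as follows. This is an illustration under our own assumptions: sigmoid outputs in (0, 1), a binary sensitive attribute, the DemP gap as the fairness term, and a fixed $\lambda$ (the actual method optimizes $\lambda$ adversarially):

```python
import numpy as np

def fair_sgd_loss(y_prob, y_true, a, lam):
    """Lagrangian objective of Phase 1 (sketch): cross-entropy
    (accuracy term) plus lam times a fairness violation, here the
    demographic-parity gap between the two sensitive groups."""
    eps = 1e-12  # avoid log(0)
    ce = -np.mean(y_true * np.log(y_prob + eps)
                  + (1 - y_true) * np.log(1 - y_prob + eps))
    demp = abs(y_prob[a == 1].mean() - y_prob[a == 0].mean())
    return ce + lam * demp
```

With $\lambda = 0$ the loss reduces to plain cross-entropy; a larger $\lambda$ penalizes fairness violations more strongly.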
Phase 2: DP-SGD. In this phase, the agents train the model that is communicated to the aggregator. This model, denoted by $h_{\theta}$, is trained by each agent to learn the predictions of its own FairSGD model ($h_{\phi}$) from Phase 1. The loss function is given by,
$$L_{mimic}(\theta) = -\mathbb{E}\left[\, \hat{y}_{\phi} \log h_{\theta}(x) + (1 - \hat{y}_{\phi}) \log\left(1 - h_{\theta}(x)\right) \right]. \quad (6)$$
Equation (6) is the cross-entropy loss between the predictions of the DP-SGD model and the labels given by the predictions of the FairSGD model. That is, $\hat{y}_{\phi} = \mathbb{1}\left[ h_{\phi}(x) \ge 0.5 \right]$.
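This mimicking loss can be sketched as follows (our illustration; thresholding the FairSGD outputs at 0.5 to obtain hard labels is our assumption):

```python
import numpy as np

def mimic_loss(dp_prob, fair_prob):
    """Phase-2 loss (sketch): cross-entropy between the DP-SGD
    model's outputs and hard labels derived from the FairSGD
    model's predictions."""
    t = (np.asarray(fair_prob) >= 0.5).astype(float)  # labels from Phase 1
    dp_prob = np.asarray(dp_prob)
    eps = 1e-12  # avoid log(0)
    return -np.mean(t * np.log(dp_prob + eps)
                    + (1 - t) * np.log(1 - dp_prob + eps))
```

The raw labels $y$ and the sensitive attribute $A$ never enter this loss; only the Phase-1 predictions do.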
To preserve the privacy of the training data and the sensitive attribute, we use LDP (Definition 3). In particular, we deploy DP-SGD, in which the privacy of the training data is preserved by sanitizing the gradients provided by SGD with Gaussian noise $\mathcal{N}(0, \sigma^2)$. Given that the learned model $h_{\theta}$ mimics $h_{\phi}$, it is reasonably fair and accurate. Algorithm 2 formally presents the training.
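The gradient sanitization at the heart of DP-SGD can be sketched as follows (a simplified NumPy illustration with arrays standing in for per-example gradients; the function and parameter names are ours):

```python
import numpy as np

def sanitize_gradients(per_example_grads, clip_norm, noise_multiplier, seed=0):
    """One DP-SGD sanitization step (sketch): clip each per-example
    gradient to L2 norm clip_norm, average, then add Gaussian noise
    with std = noise_multiplier * clip_norm / batch_size."""
    rng = np.random.default_rng(seed)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return mean_grad + rng.normal(0.0, sigma, size=mean_grad.shape)
```

Clipping bounds each sample's influence (the sensitivity); the Gaussian noise then delivers the $(\epsilon, \delta)$ guarantee via the moments accountant of [1].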
FPFL Framework. The $\theta_i$'s from each agent are communicated to the aggregator for further performance improvement. The aggregator takes a weighted sum of the individual $\theta_i$'s and broadcasts it to the agents. The agents further train on top of the aggregated model before sending it back to the aggregator. This process is repeated to achieve the following overall objective,
$$\min_{\theta} \; \sum_{i} w_i \, L_{mimic}(\theta; D_i). \quad (7)$$
We now formally couple the processes above to present the FPFL framework in Figure 2. The framework presents itself as a plug-and-play system, i.e., a user can use another loss function instead of $L_{mimic}$, change the underlying algorithms for fairness and DP, or make any other tweak it desires.
FPFL: Differential Privacy Bounds. Observe that the model learned in Phase 1, $h_{\phi}$, requires access to both the training data and the sensitive attribute $A$. Fortunately, this phase is entirely independent of the FL aggregation process. In contrast, the model learned in Phase 2, $h_{\theta}$ – trained to mimic the predictions of $h_{\phi}$ – is communicated to the aggregator.
Any information leak in FPFL may take place in the following two ways. First, the training data may be compromised through $\theta$. Second, mimicking the predictions of $h_{\phi}$ may, in turn, leak information about the sensitive attribute. We observe that the DP guarantee for the training data follows directly from [1, Theorem 1]. The following proposition proves that the training process in Phase 2 does not leak any additional information regarding $A$ to the aggregator. Corollary 1 then combines this result with [1, Theorem 1] to provide the privacy bounds.
Proposition 1
With the differentially private FPFL framework (Figure 2), the aggregator with access to the model $h_{\theta}$ learns no additional information, over the DP guarantee, regarding the sensitive attribute $A$.
Corollary 1
For the FPFL framework (Figure 2), there exist constants $c_1$ and $c_2$ such that, with sampling probability $q$ and total number of timesteps $T_2$ in Phase 2, for any $\epsilon < c_1 q^2 T_2$ the framework satisfies $(\epsilon, \delta)$-LDP for any $\delta > 0$ if $\sigma \ge c_2 \frac{q \sqrt{T_2 \ln(1/\delta)}}{\epsilon}$.
4 FPFL: Experimental Results
Datasets.
We conduct experiments on the following three datasets: Adult [23], Bank [23], and Dutch [32]. In the Adult dataset, the task is binary prediction of whether an individual's income is above or below USD 50,000. The sensitive attribute is gender, recorded as either male or female. In the Bank dataset, the task is to predict whether a client has subscribed to a term deposit. Here, we consider age as the sensitive attribute, with two categories: people between the ages of 25 and 60 form the majority group, and those under 25 or over 60 form the minority group. In the Dutch dataset, similar to Adult, we consider gender as the sensitive attribute, comprising males and females; the task is to predict the occupation. For training an FL model, we split each dataset such that every agent has an equal number of samples. To do so, we duplicate samples in the existing data – especially from the minority group – for the first two datasets. Despite this, each agent ends up with an uneven distribution of samples for each attribute, resulting in a heterogeneous data distribution among the agents. We hold out part of the data from each dataset as the test set.
Hyperparameters.
For each agent, we train two fully connected neural networks with the same architecture. Each network has two hidden layers with ReLU activations. For DemP, we consider 5 agents in our experiments and split the datasets accordingly. To estimate EO, we need sufficient samples from both sensitive groups such that each group has enough samples for both possible outcomes. In the Adult dataset, there are comparatively few female samples earning above USD 50,000. Similarly, in the Bank dataset, the minority group that subscribed to the term deposit forms only a small fraction of the entire data. Due to this, in our experiments for EO, we consider only 2 agents.
Training FairSGD (Phase 1).
Training DPSGD (Phase 2).
We use Algorithm 2 with a fixed noise multiplier, batch size, and clipping norm. For the optimizer, we use the TensorFlow Privacy library's Keras DP-SGD optimizer^3 (https://github.com/tensorflow/privacy). We train the model in this phase for 5 epochs locally before aggregation. This process is repeated 4 times.
Baselines.
To compare the resultant interplay between accuracy, fairness, and privacy in FPFL, we create the following two baselines.

B1: the agents train the model in an FL setting only for maximizing accuracy, without any fairness constraints in the loss.

B2: the agents train the model in an FL setting with the fairness constraints of Phase 1 in the loss, but without any privacy guarantee.

For both B1 and B2, the final model obtained after multiple aggregations is used to report the results. These baselines maximize accuracy (B1) and additionally ensure fairness (B2) without any privacy guarantee. Essentially, this lack of a privacy guarantee means that for both baselines we skip FPFL's Phase 2.
$(\epsilon, \delta)$ bounds.
We calculate the $(\epsilon, \delta)$ bounds for an agent from Corollary 1. To remain consistent with the broader DP-ML literature, we vary $\epsilon$ by appropriately selecting the noise multiplier $\sigma$. Observe that $\epsilon = \infty$ for B1 and B2, as their sensitivity is unbounded. As is standard, we keep $\delta$ fixed, with separate values for the DemP and EO experiments.
DemP and EO.
When the loss for DemP (3) or EO (4) is exactly zero, the model is perfectly fair. As perfect fairness is impossible, we instead try to minimize these losses. In our results, to quantify the fairness guarantees, we plot $L_{DemP}$ and $L_{EO}$ on the test set; the lower the values, the better the guarantee. For readability, we refer to $L_{DemP}$ and $L_{EO}$ as DemP and EO in our results.
Demographic Parity: Figures 3(a), 4(a), and 5(a). We consider an FL setting with 5 agents for ensuring DemP. For the Adult dataset (Figure 3(a)), B1 attains the highest accuracy but also a high DemP. A model trained with fairness constraints, i.e., B2, has reduced accuracy, but its DemP drops substantially. We find similar trends in the baselines for the Bank (Figure 4(a)) and Dutch (Figure 5(a)) datasets.
Introducing privacy guarantees with FPFL, we observe a further compromise in accuracy and fairness compared to our baselines. In general, with increasing $\epsilon$, i.e., increasing privacy loss, the trade-off between accuracy and DemP improves. For sufficiently large $\epsilon$, the accuracy and DemP are similar to those of B2. While the drop in accuracy is consistent with decreasing $\epsilon$, the DemP values do not always follow this trend.
Equalized Odds: Figures 3(b), 4(b), and 5(b). For EO, we consider an FL setting with only 2 agents. From Figure 3(b), we find that for the Adult dataset, B1 attains the highest accuracy but a high EO. With B2, we obtain reduced accuracy, but EO drops substantially. We find similar trends in the baselines for the Bank (Figure 4(b)) and Dutch (Figure 5(b)) datasets.
When we compare against FPFL training, which also guarantees privacy, we observe a trade-off between fairness and accuracy. We note that ensuring EO, especially in the Bank dataset, is very challenging; therefore, the trade-off is not as smooth. With decreasing $\epsilon$, the accuracy decreases, but the EO values do not follow any clear trend. We believe this is due to the lack of distinct samples for each category after splitting the data (despite duplication) for FL.
Future Work. Our goal is to establish the FPFL framework, in which a user can tune the fairness and privacy levels to ensure the desired performance. The framework allows the use of any fairness measure of choice by appropriately modifying the loss in Phase 1. Exploring other relevant datasets, other fairness [16, 9] and privacy techniques [27], and varying numbers of clients is left for future work.
5 Conclusion
We provide a framework that learns fair and accurate models while preserving privacy. We refer to our novel framework as FPFL (Figure 2). We show that decoupling the training process into separate phases for fairness and privacy allows us to provide a DP guarantee for the training data and sensitive attributes while reducing the number of training timesteps. We then apply FPFL on the Adult, Bank, and Dutch datasets to highlight the relation between accuracy, fairness, and privacy of an FL model.
References
 [1] (2016) Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318.
 [2] (2018) A reductions approach to fair classification. In Proceedings of the 35th International Conference on Machine Learning, PMLR, Vol. 80, Stockholm, Sweden, pp. 60–69.
 [3] (2019) Differential privacy has disparate impact on model accuracy. Advances in Neural Information Processing Systems 32, pp. 15479–15488.
 [4] (2016) Big data's disparate impact. California Law Review 104, pp. 671.
 [5] (2018) Fairness in criminal justice risk assessments: the state of the art. Sociological Methods & Research.
 [6] (2015) Fairness constraints: mechanisms for fair classification. arXiv preprint arXiv:1507.05259.
 [7] (2017) Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5 (2), pp. 153–163.
 [8] (2019) On the compatibility of privacy and fairness. In Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization, pp. 309–315.
 [9] (2021) Fairness-aware agnostic federated learning. In Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 181–189.
 [10] (2012) Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214–226.
 [11] (2014) The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9 (3–4), pp. 211–407.
 [12] (2021) Privacy preserving machine learning with homomorphic encryption and federated learning. Future Internet 13 (4), pp. 94.
 [13] (2015) Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1322–1333.
 [14] (2016) Equality of opportunity in supervised learning. In NIPS.
 [15] (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
 [16] (2020) Fair resource allocation in federated learning. In 8th International Conference on Learning Representations (ICLR 2020), Addis Ababa, Ethiopia.
 [17] (2018) Learning adversarially fair and transferable representations. In Proceedings of the 35th International Conference on Machine Learning (ICML 2018), Stockholm, Sweden, pp. 3381–3390.
 [18] (2017) Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pp. 1273–1282.
 [19] (2017) SecureML: a system for scalable privacy-preserving machine learning. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 19–38.
 [20] (2021) A survey on security and privacy of federated learning. Future Generation Computer Systems 115, pp. 619–640.
 [21] (2020) Fair learning with private demographic data. In International Conference on Machine Learning, pp. 7066–7075.
 [22] (2020) Toward robustness and privacy in federated learning: experimenting with local and central differential privacy. arXiv preprint arXiv:2009.03561.
 [23] (2020) FNNC: achieving fairness through neural networks. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), pp. 2277–2283.
 [24] (2010) Multiparty differential privacy via aggregation of locally trained classifiers. In NIPS, pp. 1876–1884.
 [25] (2015) Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321.
 [26] (2020) Differentially private and fair deep learning: a Lagrangian dual approach. arXiv preprint arXiv:2009.12562.
 [27] (2019) Federated learning with Bayesian differential privacy. In 2019 IEEE International Conference on Big Data (Big Data), pp. 2587–2596.
 [28] (2021) Federated machine learning: survey, multi-level classification, desirable criteria and future directions in communication and networking systems. IEEE Communications Surveys & Tutorials.
 [29] (2020) Federated learning with differential privacy: algorithms and performance analysis. IEEE Transactions on Information Forensics and Security 15, pp. 3454–3469.
 [30] (2019) Federated machine learning: concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST) 10 (2), pp. 1–19.
 [31] (2020) BatchCrypt: efficient homomorphic encryption for cross-silo federated learning. In 2020 USENIX Annual Technical Conference (USENIX ATC 20), pp. 493–506.
 [32] (2011) Handling conditional discrimination. In 2011 IEEE 11th International Conference on Data Mining, pp. 992–1001.