Towards Multi-Objective Statistically Fair Federated Learning

Federated Learning (FL) has emerged as a result of data ownership and privacy concerns to prevent data from being shared between multiple parties involved in a training procedure. Although issues such as privacy have gained significant attention in this domain, not much attention has been given to satisfying statistical fairness measures in the FL setting. With this goal in mind, we conduct studies to show that FL is able to satisfy different fairness metrics under different data regimes consisting of different types of clients. More specifically, uncooperative or adversarial clients might contaminate the global FL model by injecting biased or poisoned models due to existing biases in their training datasets. Those biases might be a result of imbalanced training sets (Zhang and Zhou 2019), historical biases (Mehrabi et al. 2021a), or poisoned data points from data poisoning attacks against fairness (Mehrabi et al. 2021b; Solans, Biggio, and Castillo 2020). Thus, we propose a new FL framework that is able to satisfy multiple objectives including various statistical fairness metrics. Through experimentation, we then show the effectiveness of this method compared with various baselines, its ability to satisfy different objectives collectively and individually, and its ability to identify uncooperative or adversarial clients and down-weight their effect.

Introduction

Federated Learning (FL) has recently gained significant attention as a learning paradigm in which a central server orchestrates different clients and aggregates their models to obtain a global federated model. In this setup, the server has no access to the clients' data. Thus, the data remains local to the clients, who train their own models on it. In each round, the server sends a global model to the clients. The clients receive the model and train it further on their own local datasets. They then send their models back to the server, and the server aggregates these results in a secure manner. The server uses this new aggregated model as the initial model to be sent to each client in the next round.

Research has shown that such a learning framework can help preserve the privacy of the data Thakkar et al. (2020). This, along with other advantages that FL provides, including not requiring data to be transferred between parties, has made FL a popular topic of interest. With the advancement of research and widespread interest in federated learning, different approaches have been proposed for private Truong et al. (2020), personalized Fallah et al. (2020), and fair Li et al. (2020); Mohri et al. (2019) federated learning. Although methods have been proposed in the fair FL literature that try to make clients obtain uniform test accuracies Li et al. (2020); Mohri et al. (2019), not much attention has been given to standard statistical fairness definitions. Amongst the works that do consider statistical fairness metrics Cui et al. (2021), the existence of uncooperative or adversarial clients has not been considered. Specifically, methods that rely on clients to locally satisfy fairness metrics Cui et al. (2021) can be ineffective or unreliable in the presence of adversarial clients.

With these goals in mind, we propose a framework that is able to satisfy different objectives, including but not limited to different fairness objectives. This framework is also able to identify uncooperative or adversarial clients who might inject poisoned, unfair, or poor quality models into the overall FL system Mehrabi et al. (2021); Solans et al. (2020). It can also be considered a verification or auditing framework in FL setups. In addition, we conduct studies to verify how this framework satisfies different statistical fairness metrics under different conditions with different types of clients, and we compare this method against baselines such as Federated Averaging (FedAvg) McMahan et al. (2017), Agnostic Federated Learning (AFL) Mohri et al. (2019), FCFL Cui et al. (2021), and other baselines from the fair FL literature Li et al. (2020).

To summarize, in this work we aim to answer the following questions:

  1. In a federated learning setup, where data is local to the clients and access to sensitive attributes is considered a challenge, is it possible to satisfy different statistical fairness metrics and to audit and verify clients' models?

  2. Is it possible to satisfy multiple objectives including statistical fairness metrics using federated learning?

  3. Can we identify and mitigate the effect of uncooperative or adversarial clients who might inject malicious, unfair, or generally poor quality models into the federated learning system, and instead reward better clients?

Methodology

Background

FL has a client-server setup in which the server's job is to orchestrate the participating clients, each of which trains a local model on its own local dataset. These trained models are then aggregated by the server into a new overall federated model. This aggregated model is then used by the server as the starting point for further training rounds and is sent back to the clients. The objective of the FL framework in this setup can be written as:

\min_{w} F(w) = \sum_{k=1}^{K} p_k F_k(w), \quad F_k(w) = \frac{1}{n_k} \sum_{i=1}^{n_k} \ell_i(w)   (1)

where F_k is the local objective for client k and \ell_i(w) is the loss on example i of client k using model parameters w. Notice that one can solve this problem by considering p_k

as the probability that client k

's model will be incorporated into the federated model during the aggregation process by the server. For instance, in FedAvg McMahan et al. (2017), p_k = n_k / n, where n_k is the number of examples in client k's local dataset and n is the total number of examples.
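As a concrete illustration, the following minimal Python sketch (the client parameter vectors and dataset sizes are made-up values) computes the FedAvg weights p_k = n_k / n and the corresponding weighted average of the client models:

import numpy as np

# Hypothetical flattened parameter vectors returned by three clients.
client_params = [np.array([0.2, 1.0]), np.array([0.4, 0.8]), np.array([0.1, 1.2])]
client_sizes = np.array([100, 300, 600])   # number of local training examples per client

# FedAvg aggregation weights: p_k = n_k / n.
p = client_sizes / client_sizes.sum()

# Server-side aggregation: weighted average of the client models.
global_params = sum(p_k * w_k for p_k, w_k in zip(p, client_params))
print(p, global_params)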

Input: K: number of clients; λ_j: weight for each objective j; B: local minibatch size; E: number of local epochs; η: learning rate.
Output: final federated model w.
Server Side:
     initialize w_0
     for t = 1, 2, … do
            for each client k in parallel do
                  w_t^k ← ClientUpdate(k, w_{t-1})
                  Validate client k's model when temporarily aggregated with the global FL model and calculate its score s_k
            end for
            Rank each client based on their scores s_k and assign the rank score r_k to each client k. (// Optional step; refer to the Ranking Algorithm for more details.)
            w_t ← Σ_k (r_k / Σ_{k'} r_{k'}) w_t^k (// r_k can be replaced by s_k if the optional step is skipped.)
      end for
     return w_t

     Client Side:
          ClientUpdate(k, w):
               for each local epoch i from 1 to E do
                      for each batch b with size B do
                            w ← w − η ∇ℓ(w; b)
                      end for
                end for
               return w to the server

Algorithm 1 FedVal Algorithm

Input: initial step s; step size δ.
Output: clients' rank scores r.
Initialize r with zeros for the first round; otherwise, use the scores from the previous rounds.
Sorted ← sort the clients based on their obtained scores from the FedVal algorithm.
for each client k in Sorted do
       r_k += s
       s ← s + δ
end for
return r
Algorithm 2 Ranking Algorithm

FedVal

To satisfy different objectives, including existing statistical fairness measures, and to detect and down-weight the effect of uncooperative or adversarial clients, who might train their models on imbalanced Zhang and Zhou (2019), poisoned Mehrabi et al. (2021); Solans et al. (2020), or otherwise poor quality data Mehrabi et al. (2021) that can contribute to the unfairness of their models, we dedicate a central validation set to the server, which the server uses to assign validation scores to the clients. This validation or verification step has several advantages: (1) it gives the server a dataset on which it can compute fairness measures with regard to the sensitive attributes in the data, which can be used for auditing client models; (2) the server can compute scores for each client model and weight each client accordingly; (3) the validation set can audit the FL model with regard to any sensitive attribute a practitioner has in mind, making this framework flexible. Notice that in this setup we assume the server is a trustworthy party that audits the FL model and is responsible for ensuring that the global model is reliable; however, we do not extend the same trust to the clients and assume that adversarial or uncooperative clients may exist in the system.

Although the idea of having a validation set in an FL setup is not new Stripelis and Ambite (2020), our framework and aggregation mechanism are different. In addition, our work focuses on analyzing whether or not FL can satisfy and/or find a compromise between different fairness definitions, an under-explored direction, especially when considering the existence of uncooperative or adversarial clients.

To this end, we propose to use the same objective as in (1), except that now p_k = (1/Z) Σ_j λ_j s_{k,j}, where s_{k,j} is the score that the model from client k has obtained for objective j, λ_j corresponds to the weight of objective j, and Z = Σ_k Σ_j λ_j s_{k,j} is the normalizing factor. This way, the model can find a compromise between multiple objectives, and it can work with measures that need access to some data, such as fairness measures that require access to sensitive attributes. Moreover, it allows the server to identify uncooperative or adversarial clients through validation and verification. With this objective in mind, we write our federated learning algorithm as shown in Algorithm 1.

As shown in Algorithm 1, in each training round the server sends a model to the clients to train on their own datasets. The server then fetches the clients' models, validates their contribution to the overall FL model, and calculates a validation score for each client. The clients' models are then aggregated using a weighted average based on the validation scores. In this setting, we are trying to choose the best set of clients using our validation data. Thus, in a sense, we are auditing each model trained by each client.
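The sketch below illustrates one such server round under simplifying assumptions: models are flat NumPy parameter vectors, client_update stands in for local training, score_model is a hypothetical scoring function that evaluates a model on the server's validation set and returns per-objective scores, and the temporary aggregation with the global model is approximated by a simple average (these choices are ours, not prescribed by the algorithm).

import numpy as np

def fedval_round(global_w, clients, validation_data, lambdas, client_update, score_model):
    """One FedVal server round: collect client models, score them on the
    validation set, and aggregate them with score-proportional weights."""
    client_models, scores = [], []
    for client in clients:
        w_k = client_update(client, global_w)          # local training on the client's data
        temp_w = 0.5 * (global_w + w_k)                # temporary aggregation with the global model
        s_kj = score_model(temp_w, validation_data)    # per-objective scores, e.g. [acc, 1-|SPD|, 1-|EOD|]
        s_k = float(np.dot(lambdas, s_kj))             # weighted combination of the objectives
        client_models.append(w_k)
        scores.append(s_k)
    p = np.array(scores) / np.sum(scores)              # normalize scores into aggregation weights
    return sum(p_k * w_k for p_k, w_k in zip(p, client_models))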

In order to make the client scoring more uniform and controllable, we also propose the optional ranking step, explained in more detail in Algorithm 2. This gives the server more control over rewarding cooperative clients or penalizing uncooperative clients, within a budget that the validating server decides. Here, we propose one way of doing this ranking as an additional and optional step.
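A minimal sketch of this optional ranking step follows, under the assumption that each client's accumulated rank score is increased by an amount that grows with its position in the score-sorted order; the initial step and step size values are hypothetical.

def rank_clients(scores, prev_ranks=None, initial_step=1.0, step_size=1.0):
    """Convert raw validation scores into accumulated rank scores.
    Clients with higher validation scores receive larger increments."""
    ranks = dict(prev_ranks) if prev_ranks else {k: 0.0 for k in scores}
    step = initial_step
    # Iterate clients from the lowest to the highest validation score.
    for k in sorted(scores, key=scores.get):
        ranks[k] += step
        step += step_size
    return ranks

# Example: client "c2" has the best score and gets the largest increment.
print(rank_clients({"c0": 0.2, "c1": 0.5, "c2": 0.9}))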

Obtaining a validation set

The choice of the validation set can be a challenge. Here we provide some options through which the central server can obtain a validation set in practice to perform the audits. The server could:

1. Use historical data that the centralized model used for training before the existence of the FL framework as validation data. A downside is that historical data might become outdated after a while, so the server might need to obtain new data for validation; however, within the federated framework it would be harder for the server to obtain such data.

2. Use publicly available data. A downside is that public data might not represent the intended clients well.

3. Ask clients to donate some of their data for validation only. The downside is that some of the clients' data would now be shared, which is at odds with the federated learning framework where data stays local to each client and is not shared. However, such an approach would raise fewer privacy concerns than typical centralized learning, since only a small subset of clients' data would be shared, and for validation purposes only, which might be acceptable in some settings.

4. Make clients both trainers and validators. Each client's data can be split into train and validation sets; a similar approach was proposed in Stripelis and Ambite (2020). A downside is that this can be costly, as all clients need to both train and validate all the other clients. In addition, some clients may be adversarial and assign low or inaccurate scores to other clients.

5. Dedicate certain clients to be validators only and use their data solely for validation, avoiding the cost of the previous suggestion. This brings down the cost, as not all clients need to both train and validate; in this case we have dedicated trainers and validators (a sketch follows below). However, the other downside remains: adversarial clients may exist among the validators.
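As an illustration of this last option, the following sketch (with hypothetical client IDs and a placeholder scoring function) splits the client pool into dedicated trainers and validators and aggregates the validators' scores for each trainer; using the median rather than the mean to blunt the effect of adversarial validators is our assumption for illustration, not part of the paper's algorithm.

import random
import statistics

clients = [f"client_{i}" for i in range(10)]       # hypothetical client IDs
random.shuffle(clients)
validators, trainers = clients[:3], clients[3:]    # dedicate 3 clients as validators only

def validate(validator, trainer):
    """Placeholder: a validator scores a trainer's model on its own local data."""
    return random.random()

# Each trainer is scored by every validator; the median resists outlier scores.
scores = {t: statistics.median(validate(v, t) for v in validators) for t in trainers}
print(scores)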

However, notice that despite the aforementioned challenges, having a validation set can have numerous advantages, as previously mentioned. It also gives flexibility to the practitioner who audits such models, as they can use and curate validation data appropriate for the use case they have in mind, considering demographic groups that might be more susceptible to being targets of a biased FL model depending on where the FL model is deployed.

Experiments and Results

We conduct two major studies: first, to demonstrate the effectiveness of the FedVal algorithm compared to different existing FL algorithms as well as its ability to satisfy multiple objectives; second, to demonstrate the effect of having different ratios of cooperative vs. uncooperative clients in the FedVal algorithm. In the following sections, we discuss the experiments we performed, including the datasets used and the results.

Datasets

We conducted our experiments on two datasets: the UCI Adult dataset (https://archive.ics.uci.edu/ml/datasets/adult), which contains census data and where the prediction task is whether or not an individual's income exceeds $50K, with gender (male or female) as the sensitive attribute; and the Heritage Health dataset (https://www.kaggle.com/c/hhp), which contains patient information and where the task is to predict the Charlson Index (a survival indicator). For the latter, we used age, binarized into two groups, as the sensitive attribute.
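A minimal sketch of how such a binary prediction task and sensitive attribute might be prepared, assuming an Adult-style CSV with "income" and "sex" columns (the file path and column names are assumptions for illustration; the raw UCI file ships without a header row):

import pandas as pd

# Hypothetical preprocessed CSV with named columns.
df = pd.read_csv("adult_preprocessed.csv")

# Binary prediction target: does income exceed $50K?
y = (df["income"] == ">50K").astype(int)

# Binary sensitive attribute: gender.
a = (df["sex"] == "Female").astype(int)

X = df.drop(columns=["income", "sex"])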

Metrics

We report our results using accuracy as a performance metric as well as Statistical Parity Difference (SPD) Dwork et al. (2012) and Equality of Opportunity Difference (EOD) Hardt et al. (2016), which are widely known statistical fairness metrics defined as follows:

SPD = P(\hat{Y} = 1 \mid A = a) - P(\hat{Y} = 1 \mid A = d)

EOD = P(\hat{Y} = 1 \mid Y = 1, A = a) - P(\hat{Y} = 1 \mid Y = 1, A = d)

where a represents the advantaged demographic group and d the disadvantaged group. SPD captures the difference between the (advantaged and disadvantaged) demographic groups in being assigned the positive outcome.

EOD captures the difference in true positive rates between the (advantaged and disadvantaged) demographic groups.
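A small sketch of how these two measures can be computed from model predictions, using NumPy arrays for the predictions, labels, and a binary sensitive attribute (1 marks the assumed advantaged group; the example values are made up):

import numpy as np

def spd(y_pred, sensitive):
    """Statistical Parity Difference: gap in positive prediction rates between groups."""
    return y_pred[sensitive == 1].mean() - y_pred[sensitive == 0].mean()

def eod(y_pred, y_true, sensitive):
    """Equality of Opportunity Difference: gap in true positive rates between groups."""
    adv = (sensitive == 1) & (y_true == 1)
    dis = (sensitive == 0) & (y_true == 1)
    return y_pred[adv].mean() - y_pred[dis].mean()

y_pred = np.array([1, 0, 1, 1, 0, 1])
y_true = np.array([1, 0, 1, 0, 1, 1])
s = np.array([1, 1, 1, 0, 0, 0])
print(spd(y_pred, s), eod(y_pred, y_true, s))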

Figure 1: FedVal results compared to different baseline FL algorithms. FedVal is shown to maintain a good balance in satisfying all three objectives collectively, without sacrificing accuracy as the price of fairness.
Figure 2: Results comparing FCFL optimized for a specific fairness objective vs. FedVal optimized for the same corresponding fairness objective. We observe that FedVal obtains fairer results, with lower SPD and EOD, while maintaining higher accuracy values compared to FCFL. Lower Statistical Parity and Equality of Opportunity differences represent lower bias with regard to these two fairness measures, and thus higher fairness and better results.
Figure 3: FedVal results when it satisfies all three objectives collectively vs. each objective individually. Lower Statistical Parity and Equality of Opportunity differences represent lower bias with regard to these two fairness measures, and thus higher fairness and better results.

Verifying FedVal

To verify the FedVal algorithm, we considered two aspects: (1) verifying the effectiveness of FedVal compared to different existing FL algorithms, and (2) testing FedVal's ability to satisfy different objectives, including different statistical fairness metrics along with accuracy. To put FedVal to the test, we needed different varieties of clients: cooperative clients who would train less biased models, uncooperative clients who would intentionally train their models on skewed/imbalanced data to bias their trained models Mehrabi et al. (2021); Solans et al. (2020); Zhang and Zhou (2019), and normal clients. For this set of experiments, we created 10 clients of these different varieties and evaluated FedVal accordingly.
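The sketch below shows one way such client datasets could be constructed for this kind of experiment; the skewing strategy (dropping most positive examples of the disadvantaged group for uncooperative clients) and the split of 3 uncooperative clients out of 10 are assumptions used purely for illustration.

import numpy as np

rng = np.random.default_rng(0)

def make_client(n, uncooperative=False):
    """Create a toy local dataset (features X, labels y, sensitive attribute a)."""
    a = rng.integers(0, 2, n)                  # binary sensitive attribute
    y = rng.integers(0, 2, n)                  # binary label
    X = rng.normal(size=(n, 5)) + y[:, None]   # features loosely correlated with the label
    if uncooperative:
        # Skew the data: keep only a small fraction of positive examples from the
        # disadvantaged group (a == 0), inducing a biased local model.
        keep = ~((a == 0) & (y == 1)) | (rng.random(n) < 0.05)
        X, y, a = X[keep], y[keep], a[keep]
    return X, y, a

clients = [make_client(1000, uncooperative=(i < 3)) for i in range(10)]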

Against Baselines

We compared FedVal against baselines in its ability to satisfy three objectives: the fairness objectives Statistical Parity Difference (SPD) Dwork et al. (2012) and Equality of Opportunity Difference (EOD) Hardt et al. (2016), and accuracy. The baselines considered were FedAvg McMahan et al. (2017), AFL Mohri et al. (2019), q-FedAvg Li et al. (2020), q-FedSGD Li et al. (2020), and FCFL Cui et al. (2021). In this section, we set up FedVal to optimize for the three objectives, namely SPD, EOD, and accuracy, collectively. In a follow-up experiment, we show results on FedVal satisfying each of these objectives individually. For the baselines, we used standard FL algorithms as well as algorithms designed for fairness in the FL setup. FedAvg is a widely known standard FL algorithm. AFL is an FL algorithm designed to satisfy fairness in the FL setup based on the notion of good-intent fairness, in which the goal is to minimize the maximum loss obtained by any protected class/client; in simpler words, the goal in AFL is to maximize the performance of the worst agent. q-FedAvg and q-FedSGD are algorithms also specifically designed to obtain fairness in the FL setting, in which the goal is for clients to obtain more uniform accuracies. Finally, FCFL DP and FCFL Eop are recent FL algorithms designed to satisfy statistical fairness objectives, where FCFL DP is optimized specifically for SPD and FCFL Eop for EOD. We report the averaged results over three different data splits along with error bars in Fig 1.

Results

The results in Fig 1 demonstrate that FedVal is able to effectively compromise between different objectives, obtaining balanced results in satisfying the fairness measures as well as accuracy without sacrificing one objective for another. Although FCFL is able to obtain fair outcomes with regard to the objective it is optimized for, notice that FCFL sacrifices accuracy, which is not the case for FedVal, which seeks a balance between all the objectives collectively. We also show in Fig 2 that FedVal outperforms FCFL in satisfying fairness metrics when FedVal is set to satisfy the corresponding fairness measure that FCFL is specifically optimized for, which is a fairer comparison. These results also demonstrate FedVal's ability to effectively identify poor quality models injected by uncooperative or adversarial clients, diminish their effects through validation, and obtain fairer outcomes.

Against Different Objectives

In addition to baselines, we verified FedVal's ability to satisfy different objectives. We compared results when FedVal optimizes different objectives individually vs. when it optimizes all the objectives collectively. This experiment also demonstrates the impact of varying the weights of the objectives in FedVal, since not considering an objective is equivalent to zeroing out the weight for that objective. We considered accuracy, to test the general performance of the model, along with Statistical Parity Difference (SPD) and Equality of Opportunity Difference (EOD) as standard fairness metrics, as discussed and used in previous sections. Similar to the results reported in the previous section, we report the averaged results over three different data splits along with error bars in Fig 3.
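For instance, the per-objective weights λ_j might be configured as follows; the dictionary layout and the equal weights in the collective setting are assumptions for illustration, as the paper does not prescribe specific values.

# Objective weight configurations: zeroing a weight removes that objective.
configs = {
    "accuracy_only": {"accuracy": 1.0, "spd": 0.0, "eod": 0.0},
    "spd_only":      {"accuracy": 0.0, "spd": 1.0, "eod": 0.0},
    "eod_only":      {"accuracy": 0.0, "spd": 0.0, "eod": 1.0},
    "collective":    {"accuracy": 1.0, "spd": 1.0, "eod": 1.0},  # all three objectives jointly
}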

Results in Fig 3 demonstrate that FedVal is able to satisfy each of the objectives, including the statistical fairness metrics, individually. These results complement our previous results, in which FedVal was shown to maintain a balance in satisfying different objectives collectively.

FedVal with Different Client Ratios

In addition to verifying FedVal, we perform experiments to demonstrate the effect of having different types of clients, namely uncooperative vs. cooperative, in our framework. With this goal in mind, we applied FedVal to datasets with different ratios of cooperative clients. In these experiments, we concentrate on the fairness metrics specifically. Thus, we skew the datasets of certain clients so that they produce biased models trained on imbalanced data; we call these the uncooperative clients who inject biased models, as opposed to the cooperative clients, who train their models on balanced datasets and inject fairer models. Similar to the experiments performed in previous sections, we utilize SPD and EOD as our fairness measures. In addition, we report results with the optional ranking schema in our introduced algorithm compared to results without it.
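A sketch of how such a sweep over cooperative-client ratios could be set up, reusing the hypothetical make_client and fedval_round helpers from the earlier sketches (the ratios shown are illustrative, not the ones used in the paper):

# Sweep the fraction of cooperative clients and record fairness of the final model.
results = {}
for frac_cooperative in [0.0, 0.25, 0.5, 0.75, 1.0]:
    n_coop = int(10 * frac_cooperative)
    clients = [make_client(1000, uncooperative=(i >= n_coop)) for i in range(10)]
    # ... run FedVal rounds here (e.g. with fedval_round) and evaluate SPD/EOD ...
    results[frac_cooperative] = None  # placeholder for the (SPD, EOD) of the trained model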

Results in Fig 4 demonstrate that, by increasing the number of cooperative clients who inject fairer models than the unfair clients, the bias in terms of SPD and EOD decreases, as expected. In addition, this decrease is more noticeable when we use our introduced optional ranking schema. One can observe that in the ranking setup the significant decrease in bias happens at around 3% cooperative clients in both datasets, compared to 10%-15% in the no-ranking setup. This is because the ranking schema punishes the uncooperative clients more harshly and rewards the cooperative clients more aggressively than a setup with no ranking schema. Of course, one could design other ranking strategies and control this trade-off according to the use case. These results once again verify the effectiveness of FedVal in identifying adversarial or uncooperative clients and in reducing their effects on the FL model. This demonstrates FedVal's verification, validation, and auditing capabilities.

Figure 4: Different cooperative client ratio results on Adult and Health according to SPD and EOD fairness measures.

Related Work

Federated Learning

Research in federated learning has expanded drastically in recent years Kairouz et al. (2019). Not only has work been done to optimize federated learning in general Wang et al. (2020); Li et al. (2018), but the research has also expanded to other areas, such as privacy and federated learning Truong et al. (2020), personalization and federated learning Fallah et al. (2020), and, more recently, fairness and federated learning Li et al. (2020); Mohri et al. (2019). In addition, federated learning has gained significant importance in the healthcare domain Rieke et al. (2020) and in many other applications of machine learning. Beyond general machine learning, work has expanded to Natural Language Processing (NLP) as well Lin et al. (2021). This significance and penetration of federated learning into different applications, including sensitive applications that can affect our society, brings the need to think about the fairness implications of federated learning.

Fairness

Similar to federated learning, research in fair Machine Learning (ML) and Natural Language Processing (NLP) has gained significant popularity in recent years Mehrabi et al. (2021). Different statistical fairness metrics have been proposed as measures of fairness, such as statistical parity Dwork et al. (2012) and equality of opportunity Hardt et al. (2016); we utilize some of these measures in our studies as well. However, work on fairness is not limited to proposing measures. Researchers constantly try to make existing algorithms and models fairer in different tasks and applications, such as classification Zafar et al. (2017) and regression Agarwal et al. (2019) in general ML, and many NLP applications, such as translation Basta et al. (2020), language generation Liu et al. (2020), named entity recognition Mehrabi et al. (2020), and commonsense reasoning tasks Mehrabi et al. (2021). We see federated learning algorithms as no exception from being included in such studies, considering their applications in various sensitive environments, such as healthcare systems.

Fair Federated Learning

Some previous work has tackled the fairness problem in FL Li et al. (2020); Mohri et al. (2019); Hao et al. (2021); Zhang et al. (2020); Yu et al. (2020). However, it mostly focused on making the FL setup fair for the participating clients and did not consider statistical fairness metrics. Amongst the works that did consider statistical fairness metrics Cui et al. (2021), they either satisfy such metrics locally by trusting the clients, which might not be effective in the presence of adversarial clients, or they do not consider cases where auditing and verification are needed, such as when the client data itself might be intentionally biased or poisoned Mehrabi et al. (2021); Solans et al. (2020) and can corrupt the final global FL model. Thus, although most previous work in fair federated learning focused on frameworks in which clients with different data distributions are treated fairly and similarly to each other, not much attention has been given to standard statistical fairness metrics with regard to the sensitive attributes in the data, or to the destructive outcomes an unfair FL model can have in the presence of adversarial, uncooperative, or unfair clients who can train unfair models by poisoning their data instances Mehrabi et al. (2021). Even if clients are not adversarial, some may be training their models on unintentionally biased data Mehrabi et al. (2021) that can corrupt the overall FL model. This is one of the motivations for our work.

Conclusion

In this work, we proposed FedVal, a new simple yet effective FL framework that is able to satisfy multiple objectives, including various fairness measures, and compared it with other baseline FL algorithms. We analyzed FedVal's capability in satisfying statistical fairness metrics in different scenarios with varying ratios of uncooperative or adversarial clients. In addition, we showed that FedVal is able to reduce the bias introduced by uncooperative or adversarial clients: by including a validation step to rate clients, FedVal achieves higher fairness. As a future direction, it would be interesting to add privacy and robustness objectives and analyze whether FedVal can satisfy those as well. In addition, one could investigate the incompatibility between fairness definitions Kleinberg et al. (2016) in the FL setting: in which settings can one type of bias be reduced only at the expense of another fairness measure, and in which settings is it possible to optimize all fairness objectives? It would also be interesting to compare and contrast these issues in FL vs. centralized settings.

References

  • A. Agarwal, M. Dudik, and Z. S. Wu (2019) Fair regression: quantitative definitions and reduction-based algorithms. In Proceedings of the 36th International Conference on Machine Learning, K. Chaudhuri and R. Salakhutdinov (Eds.), Proceedings of Machine Learning Research, Vol. 97, pp. 120–129. External Links: Link Cited by: Fairness.
  • C. Basta, M. R. Costa-jussà, and J. A. R. Fonollosa (2020) Towards mitigating gender bias in a decoder-based neural machine translation model by adding contextual information. In Proceedings of the Fourth Widening Natural Language Processing Workshop, Seattle, USA, pp. 99–102. External Links: Link, Document Cited by: Fairness.
  • S. Cui, W. Pan, J. Liang, C. Zhang, and F. Wang (2021) Addressing algorithmic disparity and performance inconsistency in federated learning. arXiv preprint arXiv:2108.08435. Cited by: Introduction, Introduction, Against Baselines, Fair Federated Learning.
  • C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel (2012) Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS ’12, New York, NY, USA, pp. 214–226. External Links: ISBN 9781450311151, Link, Document Cited by: Metrics, Against Baselines, Fairness.
  • A. Fallah, A. Mokhtari, and A. Ozdaglar (2020) Personalized federated learning with theoretical guarantees: a model-agnostic meta-learning approach. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33, pp. 3557–3568. External Links: Link Cited by: Introduction, Federated Learning.
  • W. Hao, M. El-Khamy, J. Lee, J. Zhang, K. J. Liang, C. Chen, and L. C. Duke (2021) Towards fair federated learning with zero-shot data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3310–3319. Cited by: Fair Federated Learning.
  • M. Hardt, E. Price, and N. Srebro (2016) Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett (Eds.), Vol. 29. External Links: Link Cited by: Metrics, Against Baselines, Fairness.
  • P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings, et al. (2019) Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977. Cited by: Federated Learning.
  • J. Kleinberg, S. Mullainathan, and M. Raghavan (2016) Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807. Cited by: Conclusion.
  • T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith (2018) Federated optimization in heterogeneous networks. arXiv preprint arXiv:1812.06127. Cited by: Federated Learning.
  • T. Li, M. Sanjabi, A. Beirami, and V. Smith (2020) Fair resource allocation in federated learning. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020, External Links: Link Cited by: Introduction, Introduction, Against Baselines, Federated Learning, Fair Federated Learning.
  • B. Y. Lin, C. He, Z. Zeng, H. Wang, Y. Huang, M. Soltanolkotabi, X. Ren, and S. Avestimehr (2021) FedNLP: a research platform for federated learning in natural language processing. arXiv preprint arXiv:2104.08815. Cited by: Federated Learning.
  • H. Liu, W. Wang, Y. Wang, H. Liu, Z. Liu, and J. Tang (2020) Mitigating gender bias for neural dialogue generation with adversarial learning. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, pp. 893–903. External Links: Link, Document Cited by: Fairness.
  • B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas (2017) Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pp. 1273–1282. Cited by: Introduction, Background, Against Baselines.
  • N. Mehrabi, T. Gowda, F. Morstatter, N. Peng, and A. Galstyan (2020) Man is to person as woman is to location: measuring gender bias in named entity recognition. In Proceedings of the 31st ACM Conference on Hypertext and Social Media, HT ’20, New York, NY, USA, pp. 231–232. External Links: ISBN 9781450370981, Link, Document Cited by: Fairness.
  • N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan (2021) A survey on bias and fairness in machine learning. ACM Comput. Surv. 54 (6). External Links: ISSN 0360-0300, Link, Document Cited by: Towards Multi-Objective Statistically Fair Federated Learning, FedVal, Fairness, Fair Federated Learning.
  • N. Mehrabi, M. Naveed, F. Morstatter, and A. Galstyan (2021) Exacerbating algorithmic bias through fairness attacks. Proceedings of the AAAI Conference on Artificial Intelligence 35 (10), pp. 8930–8938. External Links: Link Cited by: Towards Multi-Objective Statistically Fair Federated Learning, Introduction, FedVal, Verifying FedVal, Fair Federated Learning.
  • N. Mehrabi, P. Zhou, F. Morstatter, J. Pujara, X. Ren, and A. Galstyan (2021) Lawyers are dishonest? quantifying representational harms in commonsense knowledge resources. arXiv preprint arXiv:2103.11320. Cited by: Fairness.
  • M. Mohri, G. Sivek, and A. T. Suresh (2019) Agnostic federated learning. In International Conference on Machine Learning, pp. 4615–4625. Cited by: Introduction, Introduction, Against Baselines, Federated Learning, Fair Federated Learning.
  • N. Rieke, J. Hancox, W. Li, F. Milletari, H. R. Roth, S. Albarqouni, S. Bakas, M. N. Galtier, B. A. Landman, K. Maier-Hein, et al. (2020) The future of digital health with federated learning. NPJ digital medicine 3 (1), pp. 1–7. Cited by: Federated Learning.
  • D. Solans, B. Biggio, and C. Castillo (2020) Poisoning attacks on algorithmic fairness. arXiv preprint arXiv:2004.07401. Cited by: Towards Multi-Objective Statistically Fair Federated Learning, Introduction, FedVal, Verifying FedVal, Fair Federated Learning.
  • D. Stripelis and J. L. Ambite (2020) Accelerating federated learning in heterogeneous data and computational environments. arXiv preprint arXiv:2008.11281. Cited by: FedVal, Obtaining a validation set.
  • O. Thakkar, S. Ramaswamy, R. Mathews, and F. Beaufays (2020) Understanding unintended memorization in federated learning. arXiv preprint arXiv:2006.07490. Cited by: Introduction.
  • N. Truong, K. Sun, S. Wang, F. Guitton, and Y. Guo (2020) Privacy preservation in federated learning: an insightful survey from the gdpr perspective. arXiv preprint arXiv:2011.05411. Cited by: Introduction, Federated Learning.
  • H. Wang, Z. Kaplan, D. Niu, and B. Li (2020) Optimizing federated learning on non-iid data with reinforcement learning. In IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, pp. 1698–1707. External Links: Document Cited by: Federated Learning.
  • H. Yu, Z. Liu, Y. Liu, T. Chen, M. Cong, X. Weng, D. Niyato, and Q. Yang (2020) A fairness-aware incentive scheme for federated learning. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, AIES ’20, New York, NY, USA, pp. 393–399. External Links: ISBN 9781450371100, Link, Document Cited by: Fair Federated Learning.
  • M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi (2017) Fairness Constraints: Mechanisms for Fair Classification. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, A. Singh and J. Zhu (Eds.), Proceedings of Machine Learning Research, Vol. 54, Fort Lauderdale, FL, USA, pp. 962–970. External Links: Link Cited by: Fairness.
  • D. Y. Zhang, Z. Kou, and D. Wang (2020) FairFL: a fair federated learning approach to reducing demographic bias in privacy-sensitive classification models. In 2020 IEEE International Conference on Big Data (Big Data), Vol. , pp. 1051–1060. External Links: Document Cited by: Fair Federated Learning.
  • Y. Zhang and L. Zhou (2019) Fairness assessment for artificial intelligence in financial industry. arXiv preprint arXiv:1912.07211. Cited by: Towards Multi-Objective Statistically Fair Federated Learning, FedVal, Verifying FedVal.