Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination

09/25/2020
by Tao Zhang, et al.

A growing concern in the rise of machine learning is whether the decisions made by machine learning models are fair. While research is already underway to formalize a machine-learning concept of fairness and to design frameworks for building fair models at some cost in accuracy, most of this work is geared toward either supervised or unsupervised learning. Yet two observations inspired us to wonder whether semi-supervised learning might be useful for solving discrimination problems. First, a previous study showed that increasing the size of the training set may lead to a better trade-off between fairness and accuracy. Second, the most powerful models today require an enormous amount of data to train, which, in practical terms, is likely to come from a combination of labeled and unlabeled data. Hence, in this paper, we present a framework for fair semi-supervised learning in the pre-processing phase, comprising pseudo labeling to predict labels for unlabeled data, a re-sampling method to obtain multiple fair datasets, and, lastly, ensemble learning to improve accuracy and decrease discrimination. A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning. Experiments on real-world and synthetic datasets show that our method is able to use unlabeled data to achieve a better trade-off between accuracy and discrimination.
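
The three stages named in the abstract (pseudo labeling, re-sampling into multiple fair datasets, ensemble learning) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid base learner, the equal-size sampling from each (group, label) cell, the majority vote, and the demographic-parity gap used as the discrimination measure are all assumptions chosen for brevity, and every function name below is hypothetical. Labels and group identifiers are assumed to be 0 or 1.

```python
import random
from collections import Counter


def fit_centroid(X, y):
    """Toy base learner (assumed): one mean vector per class label."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_one(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))


def fair_resample(rows, rng):
    """Assumed notion of a 'fair dataset': the same number of rows
    from every (group, label) cell that is present in the pool."""
    cells = {}
    for row in rows:
        cells.setdefault((row[1], row[2]), []).append(row)
    n = min(len(cell) for cell in cells.values())
    sample = []
    for cell in cells.values():
        sample.extend(rng.sample(cell, n))
    return sample


def fair_ssl_ensemble(labeled, unlabeled, n_models=5, seed=0):
    """labeled: list of (x, group, y); unlabeled: list of (x, group)."""
    rng = random.Random(seed)
    # Step 1: pseudo labeling with a model trained on the labeled data only.
    base = fit_centroid([r[0] for r in labeled], [r[2] for r in labeled])
    pseudo = [(x, g, predict_one(base, x)) for x, g in unlabeled]
    pool = labeled + pseudo
    # Step 2: draw several balanced datasets; Step 3: train one model on each.
    models = []
    for _ in range(n_models):
        s = fair_resample(pool, rng)
        models.append(fit_centroid([r[0] for r in s], [r[2] for r in s]))
    return models


def ensemble_predict(models, x):
    """Majority vote over the ensemble members."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]


def discrimination(models, rows):
    """Demographic-parity gap: |P(pred=1 | group 0) - P(pred=1 | group 1)|."""
    rate = {}
    for g in (0, 1):
        xs = [r[0] for r in rows if r[1] == g]
        rate[g] = sum(ensemble_predict(models, x) for x in xs) / len(xs)
    return abs(rate[0] - rate[1])
```

Calling `fair_ssl_ensemble(labeled, unlabeled)` returns the list of trained models; `ensemble_predict` and `discrimination` can then be applied to held-out rows to inspect the accuracy and discrimination of the ensemble.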

research
09/14/2020

Fairness Constraints in Semi-supervised Learning

Fairness in machine learning has received considerable attention. Howeve...
research
12/31/2019

Leveraging Semi-Supervised Learning for Fairness using Neural Networks

There has been a growing concern about the fairness of decision-making s...
research
01/02/2022

Fair Data Representation for Machine Learning at the Pareto Frontier

As machine learning powered decision making is playing an increasingly i...
research
11/03/2021

Can We Achieve Fairness Using Semi-Supervised Learning?

Ethical bias in machine learning models has become a matter of concern i...
research
03/24/2022

Addressing Missing Sources with Adversarial Support-Matching

When trained on diverse labeled data, machine learning models have prove...
research
02/09/2016

Minimax Lower Bounds for Realizable Transductive Classification

Transductive learning considers a training set of m labeled samples and ...
research
12/06/2017

Product Function Need Recognition via Semi-supervised Attention Network

Functionality is of utmost importance to customers when they purchase pr...
