Gender Slopes: Counterfactual Fairness for Computer Vision Models by Attribute Manipulation

05/21/2020 ∙ by Jungseock Joo, et al. ∙ 0

Automated computer vision systems have been applied in many domains including security, law enforcement, and personal devices, but recent reports suggest that these systems may produce biased results, discriminating against people in certain demographic groups. Diagnosing and understanding the underlying true causes of model biases, however, are challenging tasks because modern computer vision systems rely on complex black-box models whose behaviors are hard to decode. We propose to use an encoder-decoder network developed for image attribute manipulation to synthesize facial images varying in the dimensions of gender and race while keeping other signals intact. We use these synthesized images to measure counterfactual fairness of commercial computer vision classifiers by examining the degree to which these classifiers are affected by gender and racial cues controlled in the images, e.g., feminine faces may elicit higher scores for the concept of nurse and lower scores for STEM-related concepts. We also report the skewed gender representations in an online search service on profession-related keywords, which may explain the origin of the biases encoded in the models.





Artificial Intelligence has made remarkable progress in the past decade. Numerous AI-based products have already become prevalent in the market, ranging from robotic surgical assistants to self-driving vehicles. The accuracy of AI systems has surpassed human capability in challenging tasks, such as face recognition (Taigman et al., 2014), lung cancer screening (Ardila et al., 2019) and pigmented skin lesion diagnosis (Tschandl et al., 2019). These practical applications of AI systems have prompted attention and support from industry, academia, and government.

Figure 1: Overview of our method for counterfactual image synthesis.

While AI technologies have contributed to increased work productivity and efficiency, a number of reports have documented algorithmic biases and discrimination caused by data-driven decision making in AI systems. For example, COMPAS, an automated risk assessment tool used in criminal justice (Brennan et al., 2009), was reported to be biased against Black defendants, assigning them higher risk scores than White defendants (Angwin et al., 2019). Another recent study reports racial and gender bias in computer vision APIs for facial image analysis, which were shown to be less accurate on certain race or gender groups (Buolamwini and Gebru, 2018).

How can biased machine learning and computer vision models impact our society? Consider the following example. Suppose an online search engine, such as Google, compiles a list of websites of medical clinics and sorts them by relevance. This list may be given to users as a search result or advertising content. The search algorithm will use the content of the websites to determine and rank their relevance, and any visual content, such as portraits of doctors, may be used as a feature in the pipeline. If the system relies on a biased computer vision model in this pipeline, the overall search results may inherit the same biases and eventually affect users' decision making. Scholars have discussed and documented biases in online media such as skewed search results (Goldman, 2008) or gender differences in STEM career ads (Lambrecht and Tucker, 2019), yet little is known about the mechanisms or origins of such biases.

While previous reports have shown that popular computer vision and machine learning models contain biases and exhibit disparate accuracies on different subpopulations, it is still difficult to identify the true causes of these biases. This is because one cannot know to which variable or factor the model responds. If we wish to verify whether a model indeed discriminates on a sensitive variable, e.g., gender, we need to isolate the factor of gender and intervene on its value for counterfactual analysis (Hardt et al., 2016).

The objective of our paper is to adopt an encoder-decoder architecture for facial attribute manipulation (Lample et al., 2017) and generate counterfactual images which vary along the dimensions of sensitive attributes: gender and race. These synthesized examples are then used to measure counterfactual fairness of black-box image classifiers offered by commercial providers. Figure 1 shows the overall process of our approach. Given an input image, we detect a face and generate a series of novel images by manipulating the target sensitive attributes while maintaining other attributes. We summarize our main contributions as follows.

Figure 2: Illustrations of (left) our encoder-decoder architecture based on FaderNetwork (Lample et al., 2017) and (right) a GAN used by Denton et al. (Denton et al., 2019). Our model explicitly separates the sensitive attributes from the remaining representation encoded in the latent code. In both models, the discriminator is optimized by adversarial training.

  1. We propose to use an encoder-decoder network (Lample et al., 2017) to generate novel face images, which allows counterfactual interventions. Unlike previous methods (Denton et al., 2019), our method explicitly isolates the factors for sensitive attributes, which is critical in identifying the true causes of model biases.

  2. We construct a novel image dataset which consists of 64,500 original images collected from web search and more than 300,000 synthesized images manipulated from the original images. These images describe people in diverse occupations and can be used for studies on bias measurement or mitigation. Both the code and data will be made publicly available.

  3. Using the new method and data, we measure counterfactual fairness of commercial computer vision classifiers and report whether, and how strongly, these classifiers respond to the attributes manipulated by our model.

Related Work

ML and AI Fairness. Fairness in machine learning has recently received much attention as a new criterion for model evaluation (Zemel et al., 2013; Hardt et al., 2016; Zafar et al., 2017; Kilbertus et al., 2017; Kusner et al., 2017). While the quality of a machine learning model has traditionally been assessed by its overall performance such as average classification accuracy measured from the entire dataset, the new fairness measures focus on the consistency of model behavior across distinct data segments or the detection of spurious correlations between target variables (e.g., loan approval) and protected attributes (e.g., race or gender).

The existing literature identifies a number of definitions and measures for ML/AI fairness (Corbett-Davies and Goel, 2018), including fairness through unawareness (Dwork et al., 2012), disparate treatment and disparate impact (Zafar et al., 2017), accuracy disparity (Buolamwini and Gebru, 2018), and equality in opportunity (Hardt et al., 2016). These are necessary because different definitions of fairness should be used in different tasks and contexts.

A common difficulty in measuring fairness is that it is hard to identify the true causes of discriminatory model behavior, because the input data combine many factors. Consequently, it is difficult to conclude that the variations in model outputs are solely caused by the sensitive or protected attributes. To overcome this limitation, Kusner et al. (Kusner et al., 2017) proposed the notion of counterfactual fairness based on causal inference. Here, a model, or predictor, is counterfactually fair if it produces the same output for any input whose sensitive attribute value is modified by an intervention while everything else stays identical. Similar to (Kusner et al., 2017), our framework is based on counterfactual fairness: we measure whether the prediction of the model differs with the intervened gender of the input image, while separating out the influences of all the other factors in the background.

Fairness and Bias in Computer Vision. Fairness in computer vision is becoming more critical as many systems are being adopted in real-world applications. For example, face recognition systems such as Amazon's Rekognition are being used by law enforcement to identify criminal suspects (Harwell, 2019). If the system produces biased results (e.g., a higher false alarm rate on Black suspects), then it may lead to a disproportionate arrest rate in certain demographic groups. In order to address this issue, scholars have attempted to identify biased representations of gender and race in public image datasets and computer vision models (Hendricks et al., 2018; Manjunatha et al., 2019; Kärkkäinen and Joo, 2019; McDuff et al., 2019). Buolamwini and Gebru (Buolamwini and Gebru, 2018) have shown that commercial computer vision gender classification APIs are biased and perform least accurately on photographs of dark-skinned females. Kyriakou et al. (Kyriakou et al., 2019) have also reported that image classification APIs may produce different results on faces of different genders and races. These studies, however, used existing images without interventions, and thus it is difficult to identify whether the classifiers responded to the sensitive attributes or to other visual cues. Kyriakou et al. (Kyriakou et al., 2019) used headshots of people against a clean white background, but this hinders the classifiers from producing many comparable tags.

Our paper is most closely related to Denton et al. (Denton et al., 2019), who use a generative adversarial network (GAN) (Goodfellow et al., 2014) to generate face images to measure counterfactual fairness. Their framework incorporates a GAN trained from a face image dataset called CelebA (Liu et al., 2015), and generates a series of synthesized samples by modifying the latent code in the embedding space to the direction that would increase the strength in a given attribute (e.g., smile). Our paper differs from this work for the following reasons. First, we use a different method to examine the essential concept of counterfactual fairness by generating samples that separate the signals of the sensitive attributes out from the rest of the images. Second, our research incorporates the generated data to measure the bias of black-box image classification APIs whereas (Denton et al., 2019) measures the bias of a dataset open to public (Liu et al., 2015). Using our distinct method and data, we aim to identify the internal biases of models trained from unknown data.

Counterfactual Data Synthesis

Problem Formulation

The objective of our paper is to measure counterfactual fairness of a predictor $Y$, a function of an image $x$. This predictor is an image classifier that automatically labels the content of input images. Without loss of generality, we consider a binary classifier, $Y(x) \in \{0, 1\}$. This function classifies, for example, whether the image displays a doctor or not. We also define a sensitive attribute, $A$: gender or race. Typically, $A$ is a binary variable in the training data, but it can take a continuous value in our experiment since we can manipulate the value without restriction. Following (Hardt et al., 2016), this predictor satisfies counterfactual fairness if $P(Y_{A \leftarrow a}(x) = y \mid x) = P(Y_{A \leftarrow a'}(x) = y \mid x)$ for all $y$ and any $a$ and $a'$, where $A \leftarrow a$ indicates an intervention on the sensitive attribute, $A$. We now explain how this is achieved by an encoder-decoder network.

The goal of this intervention is to manipulate an input image such that it changes the cue related to the sensitive attribute while retaining all the other signals. We consider two sensitive attributes: gender and race. We manipulate facial appearance because face is the strongest cue for gender and race identification (Moghaddam and Yang, 2002).
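As a rough illustration of the definition above, the following sketch checks whether a binary predictor gives the same output on every counterfactual version of one image. The names `counterfactual_gap`, `is_counterfactually_fair`, the toy predictors, and the dictionary-based "images" are all hypothetical stand-ins, not part of the paper's code.

```python
from typing import Callable, Sequence


def counterfactual_gap(predict: Callable[[dict], int],
                       counterfactuals: Sequence[dict]) -> int:
    """Spread of predictions over counterfactual versions of one image."""
    outputs = [predict(img) for img in counterfactuals]
    return max(outputs) - min(outputs)


def is_counterfactually_fair(predict: Callable[[dict], int],
                             counterfactuals: Sequence[dict]) -> bool:
    """True if the prediction is identical for every attribute value."""
    return counterfactual_gap(predict, counterfactuals) == 0


# Toy stand-ins for images: same content, different sensitive attribute value.
versions = [{"content": "stethoscope", "a": a} for a in (-1.0, 0.0, 1.0)]


def biased_predict(img: dict) -> int:
    # Hypothetical classifier that reacts to the sensitive attribute.
    return 1 if img["a"] > 0 else 0


def fair_predict(img: dict) -> int:
    # Hypothetical classifier that only looks at non-sensitive content.
    return 1 if img["content"] == "stethoscope" else 0


print(is_counterfactually_fair(biased_predict, versions))
print(is_counterfactually_fair(fair_predict, versions))
```

In practice the predictor is a remote API and the counterfactual versions are synthesized images, so exact equality is usually replaced by a statistical measure over many images.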

Figure 3: Our model controls for non-central attributes such as smiling and age. These attributes (e.g., mouth open) are fixed while the main attribute (race) is manipulated.

Counterfactual Data Synthesis

Before we elaborate our proposed method for manipulating sensitive attributes, we briefly explain why such a method is necessary to show if a model achieves counterfactual fairness. For an in-depth introduction to the framework of counterfactual fairness, we refer the reader to Kusner et al. (Kusner et al., 2017).

Many studies have reported skewed classification accuracy of existing computer vision models and APIs between gender and racial groups (Buolamwini and Gebru, 2018; Kyriakou et al., 2019; Kärkkäinen and Joo, 2019; Zhao et al., 2017). However, these findings are based on a comparative analysis, which directly compares the classifier outputs between male and female images (or White and non-White) in a given dataset. The limitation of the method is that it is difficult to identify true sources of biased model outputs due to hidden confounding factors. Even though one can empirically show differences between gender groups, such differences may have been caused by non-gender cues such as hair style or image backgrounds (see (Muthukumar et al., 2018), for example). Since there exists an infinite number of possible confounding factors, it will be very difficult to control for all of them.

Consequently, recent works in bias measurement or mitigation have adopted generative models which can synthesize or manipulate text or image data (Denton et al., 2019; Zmigrod et al., 2019). These methods generate hypothetical data in which only sensitive attributes are switched. These data can be used to measure counterfactual fairness but also augment samples in existing biased datasets.

Figure 4: Examples generated by our models, manipulated in (left) gender and (right) race.

Face Attribute Synthesis

From the existing methods available for face attribute manipulation (Yan et al., 2016; Bao et al., 2017; He et al., 2019), we chose FaderNetwork (Lample et al., 2017) as our base model. FaderNetwork is a computationally efficient model that produces plausible results, but we made a few changes to make it more suitable for our study.

Figure 2 illustrates the flows of our model and (Denton et al., 2019). The model used in (Denton et al., 2019) is based on a GAN that is trained without using any attribute labels. As in standard GANs, this model learns the latent code space from the training set. This space encodes various information such as gender, age, race, and any other cues necessary for generating a facial image. These factors are all entangled in the space, and thus it is hard to control only the sensitive attribute, which is required for the purpose of counterfactual fairness measurement. In contrast, FaderNetwork directly observes and exploits the sensitive attributes in training and makes its latent space invariant to them.

Specifically, FaderNetwork is based on an encoder-decoder network with two special properties. First, it separates the sensitive attribute, $a$, from its encoder output, $E(x)$, and both are fed into the decoder, $D$, such that it can reconstruct the original image, i.e., $D(E(x), a) \approx x$. Second, it makes $E(x)$ invariant to $a$ by using adversarial training such that the discriminator cannot predict the correct value for $a$ given $E(x)$. At test time, an arbitrary value for $a$ can be given to obtain an image with a modified attribute value.
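The two properties can be made concrete with a toy objective. The sketch below uses small random linear maps in NumPy in place of the convolutional encoder, decoder, and discriminator, and a binary attribute in {0, 1}. The sizes, the weight names, and the trade-off weight `lam` are assumptions for illustration only; no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_LAT = 8, 3  # toy dimensions (assumptions)

W_enc = rng.normal(size=(D_IN, D_LAT))      # encoder weights
W_dec = rng.normal(size=(D_LAT + 1, D_IN))  # decoder takes latent code plus attribute
w_dis = rng.normal(size=D_LAT)              # discriminator weights


def encode(x):
    return x @ W_enc


def decode(z, a):
    # The attribute is appended to the latent code before decoding.
    return np.concatenate([z, a[:, None]], axis=1) @ W_dec


def discriminate(z):
    # Estimated probability that the attribute equals 1, given only the latent code.
    return 1.0 / (1.0 + np.exp(-(z @ w_dis)))


def bce(p, target):
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))


def fader_losses(x, a, lam=1.0):
    z = encode(x)
    recon = float(np.mean((decode(z, a) - x) ** 2))  # reconstruction error
    p = discriminate(z)
    dis_loss = bce(p, a)        # discriminator tries to recover a from the latent code
    adv_loss = bce(p, 1.0 - a)  # encoder is rewarded when the discriminator is wrong
    return recon, dis_loss, recon + lam * adv_loss


x = rng.normal(size=(4, D_IN))
a = np.array([0.0, 1.0, 0.0, 1.0])
recon, dis_loss, enc_dec_loss = fader_losses(x, a)

# Test-time manipulation: keep the latent code, swap the attribute value.
x_flipped = decode(encode(x), 1.0 - a)
```

The discriminator would minimize `dis_loss` while the encoder and decoder minimize `enc_dec_loss`, alternating between the two updates.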

Since we want to minimize the changes the model makes to dimensions other than the sensitive attributes, we added two steps. First, we segment the facial skin region of the input face (Yu et al., 2018) and only retain changes within that region. This prevents the model from affecting background or hair regions. Second, we control for the effects of other attributes (e.g., smiling or young) which may be correlated with the main sensitive attribute, such that their values remain intact while the main attribute is manipulated. This was achieved by modeling these attributes as additional sensitive attributes alongside the main one during training and fixing their values at test time. This step may look unnecessary because the model is expected to separate out all information related to gender (or any other sensitive attribute). However, the dataset used to train our model may also contain biases, and it is hard to guarantee that its sensitive attributes are not correlated with other attributes. By forcing the model to produce fixed outputs for these attributes, we can explicitly control for those variables (similar ideas have been used in recent work on attribute manipulation (He et al., 2019)). Figure 3 shows the comparison between our model and the original FaderNetwork. This approach allows our model to minimize the changes in dimensions other than the main attribute being manipulated. Figure 4 shows randomly chosen results of our method.
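The first step, retaining changes only inside the skin region, amounts to a masked blend of the manipulated and original faces. The following is a minimal sketch that assumes a binary skin mask is already available from some segmentation model; the function name `composite_skin_region` and the toy arrays are hypothetical.

```python
import numpy as np


def composite_skin_region(original: np.ndarray,
                          manipulated: np.ndarray,
                          skin_mask: np.ndarray) -> np.ndarray:
    """Keep manipulated pixels inside the skin mask, original pixels elsewhere.

    original, manipulated: float arrays of shape (H, W, 3)
    skin_mask: boolean array of shape (H, W), True on facial skin
    """
    mask = skin_mask.astype(original.dtype)[..., None]  # broadcast over channels
    return mask * manipulated + (1.0 - mask) * original


# Toy example: a 4x4 image where only the central 2x2 block counts as skin.
original = np.zeros((4, 4, 3), dtype=np.float32)
manipulated = np.ones((4, 4, 3), dtype=np.float32)
skin_mask = np.zeros((4, 4), dtype=bool)
skin_mask[1:3, 1:3] = True

result = composite_skin_region(original, manipulated, skin_mask)
```

A soft (feathered) mask with values between 0 and 1 would work with the same formula and may reduce visible seams; whether the authors used one is not stated.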


Computer Vision APIs

We measured counterfactual fairness of commercial computer vision APIs which provide label classification for a large number of visual concepts, including Google Vision API, Amazon Rekognition, IBM Watson Visual Recognition, and Clarifai. These APIs are widely used in commercial products as well as academic research (Xi et al., 2019). While public computer vision datasets usually focus on general concepts (e.g., 80 common object categories in MS COCO (Lin et al., 2014)), these services generate very specific and detailed labels on thousands of distinct concepts. While undoubtedly useful, these APIs have not been fully verified for their fairness. They may be more likely to generate "positive" labels for people in certain demographic groups. These labels may include highly paid and competitive occupations such as "doctor" or "engineer" or personal traits such as "leadership" or "attractive". We measure the sensitivity of these APIs using counterfactual samples generated by our models.

Occupational Images

We constructed the baseline data that can be used to synthesize samples. We are especially interested in the effects of gender and race changes on the profession related labels provided by the APIs, and thus collected a new dataset of images related to various professions. We first obtained a list of 129 job titles from the Bureau of Labor Statistics (BLS) website and used Google Image search to download images. Many keywords resulted in biased search results in terms of the gender and race ratio. To obtain more diverse images, we additionally combined six different keywords (male, female, African American, Asian, Caucasian, and Hispanic). This results in around 250 images per keyword. We disregarded images without any face.

Figure 5: The sensitivity of image classification APIs for Nurse and Scientist to the modified facial gender cues.

We also needed datasets for training our model. For the gender manipulation model, we used CelebA (Liu et al., 2015), which is a very popular face attribute dataset with 40 labels annotated for each face. This dataset mostly contains the faces of White people, and thus is not suitable for the race manipulation model. There is no publicly available dataset with a sufficiently large number of African Americans. Instead, we obtained the list of the names of celebrities for each gender and each ethnicity from an online website, FamousFix. Then we used Google Image search to download up to 30 images for each celebrity. We estimated the true gender and race of each face by a model trained from a public dataset (Kärkkäinen and Joo, 2019) and manually verified examples with lower confidences. Finally, this dataset was combined with CelebA to train the race manipulation model.

After training, two models (gender and race) were applied to the profession dataset to generate a series of manipulated images for each input image. If there are multiple faces detected in an image, we only manipulated the face closest to the center of it. These faces are pasted into the original image, only on the facial skin region, and passed to each of the 4 APIs we tested. All the APIs provide both the presence of each label (binary) and the continuous classification confidence if the concept is present in the image. Figure 4 shows example images manipulated in gender and race.
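When several faces are detected, the rule above picks the one nearest the image center. A minimal sketch follows, assuming faces are given as `(left, top, right, bottom)` pixel boxes from some face detector; the box format and the function name are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed format


def face_closest_to_center(boxes: List[Box], width: float, height: float) -> int:
    """Return the index of the box whose center is nearest the image center."""
    cx, cy = width / 2.0, height / 2.0

    def squared_distance(box: Box) -> float:
        left, top, right, bottom = box
        bx, by = (left + right) / 2.0, (top + bottom) / 2.0
        return (bx - cx) ** 2 + (by - cy) ** 2

    return min(range(len(boxes)), key=lambda i: squared_distance(boxes[i]))


# Toy example on a 100x100 image with three detected faces.
boxes = [(0, 0, 20, 20), (40, 40, 60, 60), (80, 10, 100, 30)]
idx = face_closest_to_center(boxes, width=100, height=100)
```

Only the selected face would then be manipulated and pasted back into the original image.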

Figure 6: Example images and label prediction scores from APIs (G:Google, A:Amazon, I:IBM, C:Clarifai). “0” means the label was not detected. Blue labels indicate an increasing score with increasing masculinity (red for femininity). Some images were clipped to fit the space. Zoom in to see the details.


The sensitivity of a classifier with respect to the changes in gender or race cues of images is measured as a slope estimated from the assigned attribute value, $a$, and the model output, $Y(x(a))$, where $x(a)$ is a synthesized image with its attribute manipulated to the value $a$. The values of $a$ span a fixed range centered at 0, the gender-neutral face. This range extends beyond the values observed in training, so the outermost samples extrapolate beyond the training set. In practice, this still results in natural and plausible samples. From this range, we sampled 7 evenly spaced images for gender manipulation and 5 images for race manipulation. (We reduced the number from 7 to 5 as this was more cost effective and sufficient to discover the correlation between the attributes and output labels.) Let $x_i$ denote the $i$-th input image and $\{x_i(a)\}$ the set of images synthesized from it. For each label, we obtain 7 scores per input image for gender manipulation. Aggregating over the entire image set, we obtain a classifier output vector with one entry per attribute value, and we normalize this vector to a common scale to allow comparisons across concepts. The slope is obtained by linear regression with ordinary least squares. The magnitude of the slope determines the sensitivity of the classifier to $a$, and its sign indicates the direction.
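The slope computation can be sketched as follows. The attribute grid (`np.linspace(-1, 1, 7)`), the normalization by the maximum entry, and the function name `attribute_slope` are assumptions chosen for illustration; the exact range and normalization used in the paper are not recoverable from the text here.

```python
import numpy as np


def attribute_slope(attribute_values, label_hits) -> float:
    """OLS slope of a normalized label frequency against the attribute value.

    attribute_values: 1-D sequence of the manipulated attribute values
    label_hits: array of shape (n_images, n_values); 1 if the API returned
                the label for that synthesized image, else 0
    """
    a = np.asarray(attribute_values, dtype=float)
    hits = np.asarray(label_hits, dtype=float)
    totals = hits.sum(axis=0)  # how often the label fired at each attribute value
    if totals.max() == 0:
        return 0.0             # label never returned
    z = totals / totals.max()  # assumed normalization: largest entry becomes 1
    slope, _intercept = np.polyfit(a, z, deg=1)  # least-squares line fit
    return float(slope)


# Placeholder attribute grid with 7 evenly spaced values.
a_values = np.linspace(-1.0, 1.0, 7)

# Toy binary outputs for three input images: the label fires more often
# as the attribute value increases.
hits = np.array([
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
])

slope = attribute_slope(a_values, hits)
reversed_slope = attribute_slope(a_values, hits[:, ::-1])
```

A positive slope means the label becomes more frequent as the attribute value increases, and a negative slope means the opposite.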

Tables 1 and 2 list the labels returned by each API that are activated more frequently with images manipulated to be closer to women and to men, respectively. Not surprisingly, we found that the models behave in a way closely related to the actual gender gap in many occupations such as nurses or scientists (see also Figure 5). One can imagine that this bias was induced, at least in part, by the bias in the online media and web content on which the commercial models have been trained. Tables 3 and 4 show skewed gender and race representations in our main dataset of people's occupations. Indeed, many occupations such as nurse or engineer exhibit a very sharp gender contrast, and this may explain the behaviors of the image classifiers. Figure 6 shows example images and their label prediction scores. (The APIs output a binary decision and a prediction confidence for each label. Our analysis is based on binary values (true or false), and we found that using confidence scores makes little difference in the final results.)

Similarly, Tables 5 and 6 show the labels which are most sensitive to the race manipulation. The tables show all the dimensions which are significantly correlated with the model output ($p < 0.001$), except plain concepts such as "Face" or "Red color". We found that the APIs are in general less sensitive to race changes than to gender changes.
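The selection of labels for the tables can be sketched as a simple filter. This assumes the thresholds in the table captions are read as p-value below 0.001 and absolute slope above 0.03; the record fields, the example values, and the function name are hypothetical and not taken from the paper's results.

```python
from typing import Dict, FrozenSet, List


def select_sensitive_labels(records: List[Dict],
                            max_p: float = 0.001,
                            min_abs_slope: float = 0.03,
                            excluded: FrozenSet[str] = frozenset()) -> List[Dict]:
    """Keep labels that are significant, steep enough, and not plain concepts."""
    return [
        r for r in records
        if r["p_value"] < max_p
        and abs(r["slope"]) > min_abs_slope
        and r["label"] not in excluded
    ]


# Made-up records, for illustration only.
records = [
    {"label": "label_a", "slope": -0.20, "p_value": 0.0001},
    {"label": "label_b", "slope": 0.01, "p_value": 0.0001},  # slope too small
    {"label": "label_c", "slope": 0.15, "p_value": 0.2},     # not significant
    {"label": "Face", "slope": 0.10, "p_value": 0.0001},     # plain concept
]

selected = select_sensitive_labels(records, excluded=frozenset({"Face"}))
```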

| API | Label | Slope |
|---|---|---|
| Amazon | Nurse | -0.031 |
| Google | Fashion model | -0.262 |
| Google | Model | -0.261 |
| Google | Secretary | -0.14 |
| Google | Nurse | -0.073 |
| IBM | anchorperson | -0.213 |
| IBM | television reporter | -0.155 |
| IBM | college student | -0.151 |
| IBM | legal representative | -0.147 |
| IBM | careerist | -0.128 |
| IBM | host | -0.125 |
| IBM | steward | -0.11 |
| IBM | Secretary of State | -0.107 |
| IBM | gynecologist | -0.099 |
| IBM | celebrity | -0.097 |
| IBM | newsreader | -0.09 |
| IBM | cleaning person | -0.081 |
| IBM | nurse | -0.046 |
| IBM | laborer | -0.044 |
| IBM | workman | -0.041 |
| IBM | entertainer | -0.04 |
| Clarifai | secretary | -0.273 |
| Clarifai | receptionist | -0.268 |
| Clarifai | model | -0.211 |
| Clarifai | shopping | -0.058 |

Table 1: The sensitivity of label classification APIs to gender manipulation (female). Only labels with p-value < 0.001 and |slope| > 0.03 are shown.

| API | Label | Slope |
|---|---|---|
| Amazon | Attorney | 0.113 |
| Amazon | Executive | 0.055 |
| Google | Blue-collar worker | 0.056 |
| Google | Spokesperson | 0.040 |
| Google | Engineer | 0.038 |
| IBM | scientist | 0.254 |
| IBM | sociologist | 0.213 |
| IBM | investigator | 0.174 |
| IBM | sports announcer | 0.164 |
| IBM | resident commissioner | 0.159 |
| IBM | repairer | 0.151 |
| IBM | Representative | 0.142 |
| IBM | cardiologist | 0.140 |
| IBM | high commissioner | 0.134 |
| IBM | security consultant | 0.131 |
| IBM | speaker | 0.122 |
| IBM | internist | 0.119 |
| IBM | Secretary of the Int. | 0.114 |
| IBM | biographer | 0.109 |
| IBM | military officer | 0.107 |
| IBM | radiologist | 0.082 |
| IBM | detective | 0.081 |
| IBM | diplomat | 0.063 |
| IBM | contractor | 0.061 |
| IBM | player | 0.061 |
| IBM | medical specialist | 0.050 |
| IBM | official | 0.049 |
| IBM | subcontractor | 0.043 |
| Clarifai | film director | 0.342 |
| Clarifai | machinist | 0.192 |
| Clarifai | writer | 0.153 |
| Clarifai | repairman | 0.125 |
| Clarifai | surgeon | 0.087 |
| Clarifai | inspector | 0.085 |
| Clarifai | waiter | 0.082 |
| Clarifai | worker | 0.078 |
| Clarifai | scientist | 0.070 |
| Clarifai | singer | 0.056 |
| Clarifai | musician | 0.056 |
| Clarifai | construction worker | 0.053 |
| Clarifai | police | 0.054 |
| Clarifai | athlete | 0.048 |
| Clarifai | politician | 0.037 |

Table 2: The sensitivity of label classification APIs to gender manipulation (male). Only labels with p-value < 0.001 and |slope| > 0.03 are shown.

| Occupation | Female (proportion) | Occupation | Male (proportion) |
|---|---|---|---|
| nutritionist | 0.921 | pest control worker | 0.971 |
| flight attendant | 0.891 | handyman | 0.964 |
| hair stylist | 0.884 | logging worker | 0.950 |
| nurse | 0.860 | basketball player | 0.925 |
| medical assistant | 0.847 | businessperson | 0.920 |
| dental assistant | 0.835 | chief executive officer | 0.917 |
| merchandise displayer | 0.821 | lawn service worker | 0.909 |
| nursing assistant | 0.821 | electrician | 0.901 |
| dental hygienist | 0.815 | barber | 0.901 |
| veterinarian | 0.784 | repair worker | 0.900 |
| fashion designer | 0.775 | sales engineer | 0.889 |
| occupational therapy asst. | 0.772 | construction worker | 0.887 |
| librarian | 0.770 | maintenance worker | 0.882 |
| office assistant | 0.759 | officer | 0.882 |
| receptionist | 0.745 | radio operator | 0.871 |
| travel agent | 0.734 | music director | 0.868 |
| medical transcriptionist | 0.732 | software developer | 0.857 |
| preschool teacher | 0.730 | golf player | 0.855 |
| teacher assistant | 0.728 | CTO | 0.846 |
| counselor | 0.728 | mechanic | 0.836 |

Table 3: Skewed gender representations in Google Image search results.

| Occupation (highest White share) | White (proportion) | Occupation (lowest White share) | White (proportion) |
|---|---|---|---|
| historian | 0.885 | basketball player | 0.200 |
| building inspector | 0.875 | farmworker | 0.231 |
| funeral director | 0.852 | athlete | 0.415 |
| construction inspector | 0.847 | software developer | 0.429 |
| glazier | 0.846 | product promoter | 0.451 |
| legislator | 0.840 | interpreter | 0.457 |
| animal trainer | 0.839 | barber | 0.457 |
| boiler installer | 0.836 | medical assistant | 0.459 |
| jailer | 0.823 | food scientist | 0.462 |
| judge | 0.822 | chemical engineer | 0.488 |
| handyman | 0.821 | database administrator | 0.489 |
| baker | 0.818 | computer network architect | 0.500 |
| firefighter | 0.815 | industrial engineer | 0.513 |
| veterinarian | 0.811 | driver | 0.514 |
| pilot | 0.797 | bus driver | 0.532 |
| optician | 0.795 | fashion designer | 0.539 |
| businessperson | 0.793 | security guard | 0.539 |
| CFO | 0.791 | mechanic | 0.548 |
| maintenance manager | 0.791 | nurse | 0.548 |
| secretary | 0.785 | cashier | 0.549 |

Table 4: Skewed race representations in Google Image search results.

| API | Label | Slope |
|---|---|---|
| IBM | woman orator | -0.69 |
| IBM | President of the U.S. | -0.367 |
| IBM | first lady | -0.323 |
| IBM | high commissioner | -0.284 |
| IBM | Representative | -0.225 |
| IBM | scientist | -0.183 |
| IBM | worker | -0.131 |
| IBM | resident commissioner | -0.116 |
| IBM | sociologist | -0.099 |
| IBM | analyst | -0.09 |
| IBM | call center | -0.085 |
| IBM | diplomat | -0.085 |
| Clarifai | democracy | -0.09 |
| Clarifai | musician | -0.063 |
| Clarifai | singer | -0.046 |
| Clarifai | cheerful | -0.044 |
| Clarifai | happiness | -0.034 |
| Clarifai | music | -0.033 |
| Clarifai | confidence | -0.032 |

Table 5: The sensitivity of label classification APIs to race manipulation (Black). Only labels with p-value < 0.001 and |slope| > 0.03 are shown.

| API | Label | Slope |
|---|---|---|
| IBM | careerist | 0.179 |
| IBM | dermatologist (doctor) | 0.127 |
| IBM | legal representative | 0.111 |
| IBM | business man | 0.093 |
| IBM | entertainer | 0.034 |
| Clarifai | repair | 0.074 |
| Clarifai | beautiful | 0.074 |
| Clarifai | repairman | 0.073 |
| Clarifai | writer | 0.054 |
| Clarifai | physician | 0.053 |
| Clarifai | work | 0.051 |
| Clarifai | professional person | 0.05 |
| Clarifai | contractor | 0.05 |
| Clarifai | fine-looking | 0.048 |
| Clarifai | skillful | 0.044 |
| Clarifai | pretty | 0.039 |
| Google | Beauty | 0.06 |

Table 6: The sensitivity of label classification APIs to race manipulation (White). Only labels with p-value < 0.001 and |slope| > 0.03 are shown.


AI fairness is an increasingly important criterion to evaluate models and systems. In real world applications, especially for private models whose training processes or data are unknown, it is difficult to identify their biased behaviors or to understand the underlying causes. We introduced a novel method based on facial attribute manipulation by an encoder-decoder network to synthesize counterfactual samples, which can help isolate the effects of the main sensitive variables on the model outcomes. Using this methodology, we were able to identify hidden biases of commercial computer vision APIs on gender and race. These biases, likely caused by the skewed representation in online media, should be adequately addressed in order to make these services more reliable and trustworthy.


This work was supported by the National Science Foundation SMA-1831848.


  • J. Angwin, J. Larson, L. Kirchner, and S. Mattu (2019) Machine bias. ProPublica. Cited by: Introduction.
  • D. Ardila, A. P. Kiraly, S. Bharadwaj, B. Choi, J. J. Reicher, L. Peng, D. Tse, M. Etemadi, W. Ye, G. Corrado, et al. (2019) End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nature medicine 25 (6), pp. 954. Cited by: Introduction.
  • J. Bao, D. Chen, F. Wen, H. Li, and G. Hua (2017) CVAE-gan: fine-grained image generation through asymmetric training. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2745–2754. Cited by: Face Attribute Synthesis.
  • T. Brennan, W. Dieterich, and B. Ehret (2009) Evaluating the predictive validity of the compas risk and needs assessment system. Criminal Justice and Behavior 36 (1), pp. 21–40. Cited by: Introduction.
  • J. Buolamwini and T. Gebru (2018) Gender shades: intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency, pp. 77–91. Cited by: Introduction, Related Work, Related Work, Counterfactual Data Synthesis.
  • S. Corbett-Davies and S. Goel (2018) The measure and mismeasure of fairness: a critical review of fair machine learning. arXiv preprint arXiv:1808.00023. Cited by: Related Work.
  • E. Denton, B. Hutchinson, M. Mitchell, and T. Gebru (2019) Detecting bias with generative counterfactual face attribute augmentation. arXiv preprint arXiv:1906.06439. Cited by: Figure 2, item 1, Related Work, Counterfactual Data Synthesis, Face Attribute Synthesis.
  • C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel (2012) Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference, pp. 214–226. Cited by: Related Work.
  • E. Goldman (2008) Search engine bias and the demise of search engine utopianism. In Web Search, pp. 121–133. Cited by: Introduction.
  • I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio (2014) Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680. Cited by: Related Work.
  • M. Hardt, E. Price, N. Srebro, et al. (2016) Equality of opportunity in supervised learning. In Advances in neural information processing systems, pp. 3315–3323. Cited by: Introduction, Related Work, Related Work, Problem Formulation.
  • D. Harwell (2019) Oregon became a testing ground for amazon’s facial-recognition policing. but what if rekognition gets it wrong. Washington Post. Cited by: Related Work.
  • Z. He, W. Zuo, M. Kan, S. Shan, and X. Chen (2019) Attgan: facial attribute editing by only changing what you want. IEEE Transactions on Image Processing 28 (11), pp. 5464–5478. Cited by: Face Attribute Synthesis, Face Attribute Synthesis.
  • L. A. Hendricks, K. Burns, K. Saenko, T. Darrell, and A. Rohrbach (2018) Women also snowboard: overcoming bias in captioning models. In European Conference on Computer Vision, pp. 793–811. Cited by: Related Work.
  • K. Kärkkäinen and J. Joo (2019) FairFace: face attribute dataset for balanced race, gender, and age. arXiv preprint arXiv:1908.04913. Cited by: Related Work, Counterfactual Data Synthesis, Occupational Images.
  • N. Kilbertus, M. R. Carulla, G. Parascandolo, M. Hardt, D. Janzing, and B. Schölkopf (2017) Avoiding discrimination through causal reasoning. In Advances in Neural Information Processing Systems, pp. 656–666. Cited by: Related Work.
  • M. J. Kusner, J. Loftus, C. Russell, and R. Silva (2017) Counterfactual fairness. In Advances in Neural Information Processing Systems, pp. 4066–4076. Cited by: Related Work, Related Work, Counterfactual Data Synthesis.
  • K. Kyriakou, P. Barlas, S. Kleanthous, and J. Otterbacher (2019) Fairness in proprietary image tagging algorithms: a cross-platform audit on people images. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 13, pp. 313–322. Cited by: Related Work, Counterfactual Data Synthesis.
  • A. Lambrecht and C. Tucker (2019) Algorithmic bias? an empirical study of apparent gender-based discrimination in the display of stem career ads. Management Science 65 (7), pp. 2966–2981. Cited by: Introduction.
  • G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer, and M. Ranzato (2017) Fader networks: manipulating images by sliding attributes. In Advances in Neural Information Processing Systems, pp. 5967–5976. Cited by: Figure 2, item 1, Introduction, Face Attribute Synthesis.
  • T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick (2014) Microsoft coco: common objects in context. In European conference on computer vision, pp. 740–755. Cited by: Computer Vision APIs.
  • Z. Liu, P. Luo, X. Wang, and X. Tang (2015) Deep learning face attributes in the wild. In Proceedings of the IEEE international conference on computer vision, pp. 3730–3738. Cited by: Related Work, Occupational Images.
  • V. Manjunatha, N. Saini, and L. S. Davis (2019) Explicit bias discovery in visual question answering models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9562–9571. Cited by: Related Work.
  • D. McDuff, S. Ma, Y. Song, and A. Kapoor (2019) Characterizing bias in classifiers using generative models. arXiv preprint arXiv:1906.11891. Cited by: Related Work.
  • B. Moghaddam and M. Yang (2002) Learning gender with support faces. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (5), pp. 707–711. Cited by: Problem Formulation.
  • V. Muthukumar, T. Pedapati, N. Ratha, P. Sattigeri, C. Wu, B. Kingsbury, A. Kumar, S. Thomas, A. Mojsilovic, and K. R. Varshney (2018) Understanding unequal gender classification accuracy from face images. arXiv preprint arXiv:1812.00099. Cited by: Counterfactual Data Synthesis.
  • Y. Taigman, M. Yang, M. Ranzato, and L. Wolf (2014) Deepface: closing the gap to human-level performance in face verification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1701–1708. Cited by: Introduction.
  • P. Tschandl, N. Codella, B. N. Akay, G. Argenziano, R. P. Braun, H. Cabo, D. Gutman, A. Halpern, B. Helba, R. Hofmann-Wellenhof, et al. (2019) Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. The Lancet Oncology. Cited by: Introduction.
  • N. Xi, D. Ma, M. Liou, Z. C. Steinert-Threlkeld, J. Anastasopoulos, and J. Joo (2019) Understanding the political ideology of legislators from social media images. arXiv preprint arXiv:1907.09594. Cited by: Computer Vision APIs.
  • X. Yan, J. Yang, K. Sohn, and H. Lee (2016) Attribute2image: conditional image generation from visual attributes. In European Conference on Computer Vision, pp. 776–791. Cited by: Face Attribute Synthesis.
  • C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang (2018) Bisenet: bilateral segmentation network for real-time semantic segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 325–341. Cited by: Face Attribute Synthesis.
  • M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi (2017) Fairness constraints: mechanisms for fair classification. In Artificial Intelligence and Statistics, pp. 962–970. Cited by: Related Work, Related Work.
  • R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork (2013) Learning fair representations. In International Conference on Machine Learning, pp. 325–333. Cited by: Related Work.
  • J. Zhao, T. Wang, M. Yatskar, V. Ordonez, and K. Chang (2017) Men also like shopping: reducing gender bias amplification using corpus-level constraints. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2979–2989. Cited by: Counterfactual Data Synthesis.
  • R. Zmigrod, S. J. Mielke, H. Wallach, and R. Cotterell (2019) Counterfactual data augmentation for mitigating gender stereotypes in languages with rich morphology. arXiv preprint arXiv:1906.04571. Cited by: Counterfactual Data Synthesis.