PrivacyNet: Semi-Adversarial Networks for Multi-attribute Face Privacy

by   Vahid Mirjalili, et al.

In recent years, the utilization of biometric information has become more and more common for various forms of identity verification and user authentication. However, as a consequence of the widespread use and storage of biometric information, concerns regarding sensitive information leakage and the protection of users' privacy have been raised. Recent research efforts targeted these concerns by proposing the Semi-Adversarial Networks (SAN) framework for imparting gender privacy to face images. The objective of SAN is to perturb face image data such that it cannot be reliably used by a gender classifier but can still be used by a face matcher for matching purposes. In this work, we propose a novel Generative Adversarial Networks-based SAN model, PrivacyNet, that is capable of imparting selective soft biometric privacy to multiple soft-biometric attributes such as gender, age, and race. While PrivacyNet is capable of perturbing different sources of soft biometric information reliably and simultaneously, it also allows users to choose to obfuscate specific attributes, while preserving others. The results from extensive experiments on five independent face image databases demonstrate the efficacy of our proposed model in imparting selective multi-attribute privacy to face images.




1 Introduction

Face recognition has been widely used in several applications, including surveillance [1, 2, 3, 4] and border patrol [5, 6, 7]. Also, with the introduction of face ID applications in hand-held devices, face recognition has become ubiquitous in everyday use. In the modern age of technology, face recognition has been well established as a primary layer of security for protecting and accessing personal and sensitive data [8].

The primary purpose of collecting and storing face images in a biometric system is the recognition of individuals. Yet, face images stored in a database contain auxiliary information about each individual in the database [8]. This auxiliary information is commonly referred to as soft biometric attributes, which include gender, age, ethnicity, race, body mass index, and health characteristics [9].

Soft biometric attributes can facilitate a large variety of applications [8], such as improving face recognition performance [10], profiling users, and developing targeted advertisements [11]. Recent advances in machine learning have made it possible to extract such soft biometric attributes from face images automatically [12, 13]. However, users of such biometric systems may prefer not to be profiled based on their demographic attributes and may wish to opt out of such services due to privacy concerns. In this regard, certain privacy laws allow users to choose what information about themselves to reveal and what information to conceal [14, 15, 16]. Moreover, in the near future, a host of biometric applications is expected to implement best practices with regard to respecting the privacy of users by preventing automatic information extraction from face images in the absence of the users’ consent [17, 18]. However, even if information is not extracted intentionally, user images stored in a database are still susceptible to privacy breaches from third-party users or applications. Thus, to provide actionable means and guarantees for preventing the automatic mining of personal information from face images, recent research has explored the possibility of imparting soft biometric privacy to face images by modifying the image data directly [19, 20, 21, 22].

To provide a practical approach for providing gender privacy to face images, we previously developed the Semi-Adversarial Networks (SAN) model [23]. This SAN model is able to conceal gender information from face images while retaining satisfactory face matching accuracy. In later studies, improvements of the SAN model resulted in state-of-the-art matching performance with arbitrary face matchers under the constraint that arbitrary gender classifiers were not able to extract gender information from the modified face images [24, 25].

While the previously developed SAN model is only capable of hiding gender information, in this paper, we propose the PrivacyNet model for imparting multi-attribute privacy to face images including age, gender, and race. The overall objective of this work is to develop a model that can induce selective (which attributes to conceal) and collective (how many attributes to conceal) perturbations to impart soft biometric privacy to face images (Fig. 1) while retaining biometric matching performance.

Fig. 1: Illustration of the overall objective of this work: transforming an input face image across three orthogonal axes for imparting multi-attribute privacy selectively while retaining recognition utility. The abbreviated letters are M: Matching, G: Gender, A: Age, and R: Race.

2 Related Work

With recent advances in machine learning and deep learning for computer vision, the prediction of soft biometric attributes such as age, gender, and ethnicity from facial biometric data has been widely studied [26, 27, 28, 9, 29]. For instance, the use of convolutional neural networks for predicting the gender from face images has resulted in models with almost perfect prediction accuracy [30, 31, 32, 33, 28]. Methods for estimating the apparent age from face images are similarly well studied, and current state-of-the-art methods can predict the apparent age of a person with a prediction error below three years on average [26, 34, 35, 27].

While tremendous progress has been made towards the automatic extraction of personal attributes from face images, the development of methods and techniques for imparting soft biometric privacy is still a relatively recent area of research. In 2014, Othman and Ross introduced the concept of soft biometric privacy, where a face image is modified such that the gender information is confounded while the recognition utility of the face image is preserved [19]. The researchers proposed a face mixing approach, where a face image is morphed with a candidate face image from the opposite gender. The resulting mixed face image contains both male and female features such that the gender information is fully anonymized. Sim and Zhang then developed methods for imparting soft biometric privacy to multiple attributes based on multi-modal discriminant analysis, in which certain attributes can be selectively suppressed while retaining others [20]. They proposed a technique that decomposes a face image representation into orthogonal axes corresponding to gender, age, and ethnicity, and the identity information is left as a residual of this decomposition. This enables transforming a face image along one axis, which modifies the corresponding attribute, while other information of the face image remains visibly unchanged to the human eye. They also showed that their proposed method can alter identities of face images, which is useful for face de-identification [36, 37]. However, Sim and Zhang’s method [20] cannot explicitly preserve the matching performance of transformed face images; therefore, the biometric utility of the resulting face images is severely diminished.

In 2013, Szegedy et al. [38] studied the vulnerability of Deep Neural Networks (DNNs) towards adversarial perturbations. Adversarial perturbations are small perturbations added to an input image, typically imperceptible to a human observer, that can cause the DNN to misclassify images with high confidence. In recent years, several methods for generating such adversarial perturbations have been proposed, and the development of methods that make DNN-based models more robust against these so-called adversarial attacks remains an active area of research [39, 40, 41, 42, 43, 44, 45]. The vulnerability to adversarial attacks raises several security concerns for the use of machine learning systems in computer vision applications [46, 18, 39, 47]. Recently, Rozsa et al. [48] investigated the robustness of machine learning applications for predicting soft-biometric attributes from face images against adversarial attacks. Based on the concept of adding adversarial perturbations to an input image, Mirjalili and Ross [11] investigated the possibility of generating adversarial perturbations for imparting soft-biometric privacy to face images. This scheme was further extended by Chhabra et al. [21] to conceal multiple face attributes simultaneously. While these perturbation-based methods are shown to successfully derive adversarial examples based on a specific attribute classifier, the perturbed output images are not generalizable across unseen attribute classifiers. For a real-world privacy application, generalizability of adversarial examples to unseen attribute classifiers is critical [25].

Recently, methods have been developed that impart privacy through the design and use of specific face representation vectors, which have been derived from the original face images without including the sensitive information that is to be concealed [49, 50, 51, 52]. For instance, the SensitiveNet [51] model generates agnostic face representations for biometric recognition such that gender and race information are removed from these representations [53]. However, storing face representation vectors may not be desirable in many applications since these vectors are neither interpretable by humans nor compatible with existing biometric software. In this work, we develop a generally applicable method that applies perturbations to the face images directly instead of deriving representations.

In previous work [23], we developed a deep learning-based model to generate perturbed examples for obfuscating gender information in face images. The neural network was coined Semi-Adversarial Network (SAN) and is composed of a convolutional autoencoder for synthesizing face images such that the gender information in the synthesized images is obfuscated while their matching utility is preserved. The SAN model is trained using an auxiliary gender classifier and an auxiliary face matcher. After training, the auxiliary subnetworks are discarded and the convolutional autoencoder is used for performance evaluation on unseen data. It was shown that this model is able to suppress gender information as assessed by some unseen attribute classifiers while the matching utility, assessed by unseen face matchers, was retained. (In contrast to “auxiliary” classifiers, the term “unseen” indicates that the attribute classifier or face matcher was not used during the training stage.) Moreover, the generalizability of SAN models to fool arbitrary gender classifiers can be further enhanced by diversifying the auxiliary classifiers during training [25] or by combining multiple, diverse SAN models [24].

Outside the context of imparting soft biometric privacy via SAN, Generative Adversarial Networks (GANs) [54, 55, 56] and their variants have shown remarkable performance in many computer vision tasks such as image-to-image translation [57] and face image synthesis [58, 59, 60, 61]. However, GAN models are not considered a viable solution for imparting soft biometric privacy since the objective of GAN-based models for image-to-image translation is to synthesize realistic face images while the biometric matching utility is not explicitly preserved.

While our previous work [24] has shown its efficacy in imparting gender privacy to face images, existing methods for imparting multi-attribute privacy are limited. As previously mentioned, the work of Chhabra et al. [21] is not generalizable to previously unseen attribute classifiers. Furthermore, the controllable face privacy model proposed by Sim and Zhang [20] cannot retain the recognition utility of the face images.

The main contribution of this work is the design of a multi-attribute face privacy model to provide controllable soft-biometric privacy. This proposed GAN-based privacy model, which we refer to as “PrivacyNet,” modifies an input face image to obfuscate soft-biometric attributes while maintaining the recognition capability on the generated face images. To the best of our knowledge, the “PrivacyNet” model proposed in this paper is the first method for multi-attribute privacy that generalizes to unseen attribute classifiers while preserving the recognition utility of face images.

3 Proposed method

Fig. 2: Schematic representation of the architecture of PrivacyNet for deriving perturbations to obfuscate three attribute classifiers, gender, age, and race, while allowing biometric face matchers to perform well. (A) Different components of the PrivacyNet: generator, source discriminator, attribute classifier, and auxiliary face matcher. (B) Cycle-consistency constraint applied to the generator by transforming an input face image to a target label and reconstructing the original version.

3.1 Problem Formulation

Given a face image, let one set of face attributes be designated for obfuscation and a second set for preservation. The overall objective is to perturb the input image such that the perturbed image has the following properties:

  • For each soft biometric attribute selected for obfuscation, the performance of an unseen attribute classifier is substantially reduced.

  • For the remaining set of attributes, the performance of an arbitrary classifier is not noticeably adversely affected; that is, the performance of an attribute classifier on the perturbed image is close to its performance on the original face image.

  • The primary biometric utility, which is face recognition, must be retained for the modified face image. In other words, given pairs of image examples before and after perturbation, the matching performance as assessed by an arbitrary face matcher is not substantially affected.

3.2 PrivacyNet

Fig. 3: The detailed neural network architectures of three sub-networks of PrivacyNet: the generator, the discriminators (source discriminator and attribute classifier), and the pre-trained auxiliary face matcher. Note that the source discriminator and the attribute classifier share the same convolutional layers and only differ in their respective output layers.

According to the objectives described in Section 3.1, the PrivacyNet neural network architecture (Fig. 2A) is composed of four sub-networks: a generator that modifies the input image, a source discriminator that determines whether an image is real or modified, an attribute classifier for predicting facial attributes, and an auxiliary face matcher for biometric face recognition. Together, these subnetworks form a cycle-consistent GAN [59] as illustrated in Fig. 2B. Given an RGB input face image, the attribute label vector corresponds to the ground-truth attribute labels of the original face image. The target label vector, whose dimensionality is determined by the total number of attributes, denotes the desired facial attributes for modifying the face image. Given a target vector, the objective of the generator is to synthesize a new image that is mapped to the target label vector by the attribute classifier. The other component of the GAN model is a source discriminator, which is trained to distinguish real images from those synthesized by the generator.

The total losses for training the discriminator and the generator are weighted combinations of individual loss terms, where the weighting coefficients are hyperparameters representing the relative weights of the corresponding loss terms. The individual loss terms that make up the two total losses are described in the following paragraphs.

Separate loss terms for discriminating between real and synthesized images are defined for the discriminator subnetwork and for the generator subnetwork. According to these loss terms, the discriminator learns to distinguish synthesized images from real images. On the other hand, the generator learns to generate images that increase the prediction error of the discriminator (by generating realistic face images).
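Since Section 3.3 describes the source discriminator as returning a scalar score for a Wasserstein GAN loss [64], the two adversarial terms can be sketched as follows. This is a minimal illustration under that assumption; the function names and toy scores are hypothetical, and any regularizer (such as a gradient penalty) is omitted.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    """Wasserstein-style critic loss (hypothetical helper).

    It becomes more negative as real images score higher than synthesized ones.
    """
    return float(np.mean(fake_scores) - np.mean(real_scores))

def generator_adv_loss(fake_scores):
    """Generator side: lower when synthesized images receive high critic scores."""
    return float(-np.mean(fake_scores))

# Toy critic scores, not taken from the paper.
real = np.array([2.0, 1.5, 1.0])
fake = np.array([-1.0, -0.5, 0.0])

d_loss = critic_loss(real, fake)
g_loss = generator_adv_loss(fake)
```

With these toy scores the critic already separates the two groups, so its loss is negative while the generator loss is positive.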

Similar to the previous loss terms for distinguishing between real and synthesized face images, attribute classification loss terms are defined for both the attribute classifier and the generator. According to these loss terms, the generator will learn to produce face images that maximize the error of the attribute classifier while the attribute classifier is optimized to predict the original face attributes corresponding to the input face image.
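One way to picture the two attribute terms is with a cross-entropy between predicted attribute probabilities and a label vector. The sketch below assumes binary attributes and a binary cross-entropy; the classifier term compares predictions on a real image with its original labels, and the generator term compares predictions on a synthesized image with the target labels. The helper name and all numbers are hypothetical, and the exact form used by the authors may differ.

```python
import numpy as np

def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    return float(-np.mean(labels * np.log(probs)
                          + (1.0 - labels) * np.log(1.0 - probs)))

# Classifier side: predictions on a real image vs. its original labels.
orig_labels = np.array([1.0, 0.0, 1.0])
probs_on_real = np.array([0.9, 0.2, 0.8])
attr_loss_classifier = binary_cross_entropy(probs_on_real, orig_labels)

# Generator side: predictions on the synthesized image vs. the target labels.
target_labels = np.array([0.0, 0.0, 1.0])
probs_on_synth = np.array([0.3, 0.1, 0.7])
attr_loss_generator = binary_cross_entropy(probs_on_synth, target_labels)
```

When the target labels differ from the original labels, a generator that minimizes its term moves the classifier's output away from the original attributes.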

The loss for optimizing the performance of the biometric face matcher on the perturbed images is defined as the squared L2 distance between the original face image and the version synthesized according to the modified attribute vector.

Lastly, a reconstruction loss term is used to form a cycle-consistent GAN that is able to reconstruct the original face image from its modified version.

Note that the distance term in the reconstruction loss (Eq. 8) is computed as the pixel-wise L1 norm between the original and modified image, which empirically results in less blurry images compared to employing an L2 norm as the distance measure [57].
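The matching and reconstruction terms, and their weighted combination with the other generator terms, can be sketched as below. This is an illustration under assumptions: the matching distance is taken between descriptors produced by the auxiliary face matcher, the L1 term is averaged over pixels, and the weights and all numeric values are hypothetical placeholders rather than the settings used in the paper.

```python
import numpy as np

def matching_loss(desc_orig, desc_synth):
    """Squared L2 distance between two face descriptors."""
    diff = np.asarray(desc_orig, dtype=float) - np.asarray(desc_synth, dtype=float)
    return float(np.sum(diff ** 2))

def reconstruction_loss(img_orig, img_recon):
    """Pixel-wise L1 distance, averaged over all pixels (assumed normalization)."""
    return float(np.mean(np.abs(np.asarray(img_orig) - np.asarray(img_recon))))

def generator_total(src, attr, match, rec, w_attr, w_match, w_rec):
    """Weighted sum of the generator loss terms; weights are hyperparameters."""
    return src + w_attr * attr + w_match * match + w_rec * rec

# Toy inputs, not taken from the paper.
m = matching_loss([1.0, 0.0, 2.0], [0.0, 0.0, 1.0])
r = reconstruction_loss(np.zeros((2, 2)), np.full((2, 2), 0.5))
total = generator_total(src=0.5, attr=0.2, match=m, rec=r,
                        w_attr=1.0, w_match=0.1, w_rec=10.0)
```

Raising the matching weight pushes the generator toward outputs whose descriptors stay close to those of the input, which is the role the auxiliary face matcher plays during training.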

3.3 Neural Network Architecture of PrivacyNet

The composition of the different neural networks used in PrivacyNet (generator, real vs. fake classifier, attribute classifier, and face matcher) is described in Fig. 3. The generator and the discriminator architectures were adapted from [58] and [59], respectively.

Generator. The generator receives as input an RGB face image along with the target labels concatenated as extra channels. The first two convolutional layers are strided and reduce the spatial size of the input image, producing 128 channels. The convolutional layers are followed by instance normalization layers (InstanceNorm) [62]. The layer activations are computed by applying the non-linear ReLU activation function to the InstanceNorm outputs. Then, a series of residual blocks [63] is applied, followed by two transposed convolutions for upsampling the image. Finally, the output image is constructed by a convolution layer and the hyperbolic tangent (tanh) activation function, which returns pixel values in the range [-1, 1] (the input image pixels are also scaled to this range).
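Concatenating the target labels as extra channels can be done by broadcasting each label value into a constant map of the same height and width as the image. The sketch below illustrates this together with the scaling of 8-bit pixels to [-1, 1]; the image size, label length, and function names are hypothetical choices for the example.

```python
import numpy as np

def scale_to_tanh_range(pixels_uint8):
    """Map 8-bit pixel values in [0, 255] to floats in [-1, 1]."""
    return pixels_uint8.astype(np.float32) / 127.5 - 1.0

def concat_labels_as_channels(image, labels):
    """Append one constant-valued channel per label entry.

    image:  (H, W, 3) array
    labels: (n,) target label vector
    returns (H, W, 3 + n) array
    """
    h, w, _ = image.shape
    labels = np.asarray(labels, dtype=image.dtype)
    label_maps = np.broadcast_to(labels.reshape(1, 1, -1), (h, w, labels.size))
    return np.concatenate([image, label_maps], axis=-1)

# Toy 8x8 image and a 5-entry target vector (both sizes are placeholders).
raw = np.full((8, 8, 3), 255, dtype=np.uint8)
target = np.array([1, 0, 0, 1, 0])
generator_input = concat_labels_as_channels(scale_to_tanh_range(raw), target)
```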

Discriminator and Attribute Classifier. The discriminator, as shown in Fig. 3, combines the source discriminator and the attribute classifier into one network where all the layers except the last convolution layer are shared between the two tasks. All the shared convolution layers are followed by a Leaky ReLU non-linear activation with a small negative slope. In the last layer, separate convolutional layers are used for the two tasks, where the source discriminator returns a scalar score for computing the loss according to Wasserstein GAN [64], and the attribute classifier returns a vector of probabilities for each attribute class.

Face Matcher. Lastly, the auxiliary face matcher is adapted from the publicly available pre-trained VGG-Face CNN model, which receives input face images and computes their face descriptors [65].

3.4 Datasets

We have used five datasets in this study: CelebA [66], MORPH [67], MUCT [68], RaFD [69], and UTK-face [70]. Table I shows the number of examples in each dataset, including the number of examples for each face attribute. Since the race label distribution in CelebA is heavily skewed towards Caucasians, whereas MORPH is heavily skewed towards persons with African ancestry, we combined CelebA and MORPH for training. Both the CelebA and MORPH datasets are split into training and evaluation sets in a subject-disjoint manner. The two training subsets from CelebA and MORPH are merged to train the PrivacyNet model with a relatively balanced race distribution. The other three datasets, MUCT, RaFD, and UTK-face, are used only for evaluation. While all five datasets provide binary gender labels, each dataset lacks the ground-truth labels for at least one of the other attributes, age or race. (In this paper we treat gender as a binary attribute with two labels, male and female; however, it must be noted that societal and personal interpretation of gender can result in many more classes. Facebook, for example, suggests over 58 gender classes.)
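A subject-disjoint split assigns every image of a given subject to the same side of the split, so no identity appears in both training and evaluation data. The following is a minimal sketch of that idea; the test fraction, random seed, and identifiers are hypothetical and do not reflect the actual split sizes used in this study.

```python
import random
from collections import defaultdict

def subject_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split (subject_id, image_id) pairs so that no subject is in both sets."""
    by_subject = defaultdict(list)
    for subject_id, image_id in samples:
        by_subject[subject_id].append(image_id)

    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [(s, i) for s, i in samples if s not in test_subjects]
    test = [(s, i) for s, i in samples if s in test_subjects]
    return train, test

# Toy data: 5 subjects with 3 images each.
samples = [(f"subj{s}", f"img{s}_{k}") for s in range(5) for k in range(3)]
train, test = subject_disjoint_split(samples)
```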

Dataset    Gender               Race                          Age groups
           Male      Female     African-descent   Caucasian   Young    Middle-aged   Old
CelebA     84,434    118,165    11,119            191,480     1,020    1,181         595
MORPH      47,057    8,551      42,897            12,711      3,733    3,468         861
MUCT       1,844     1,910      1,030             2,724       973      870           665
RaFD       1,008     600        0                 1,608       1,250    356           2
UTK-face   12,582    11,522     4,558             19,546      5,384    4,856         3,945
TABLE I: Overview of datasets used in this study, with the number of face images corresponding to each attribute.

Gender Attribute: All the five datasets considered in this study provide ground-truth labels for the gender attribute. Furthermore, since gender is a well-studied topic, there are several face-based gender predictors available for evaluation. In this study, we have considered three gender classifiers for evaluation: a commercial-off-the-shelf software G-COTS, IntraFace [71], and AFFACT [28].

Race Labels: We consider binary labels for race: Caucasians and African descent. Samples that do not belong to these two race groups are omitted from our study since the other race groups are under-represented in our training datasets. We have used the ground-truth labels provided in the MORPH and UTK-face datasets, but for the other three datasets, we have labeled the samples in multiple stages. First, an initial estimate of the race attribute is computed using commercial software R-COTS. Next, the predictions made by R-COTS from all samples of the same subject are aggregated, and subjects that show discrepant predictions for different samples are visualized and the discrepant labels are manually corrected. Finally, one random sample from every subject is visually inspected to verify the predicted label. Furthermore, note that since RaFD did not have any sample from the African descent race group, we removed this dataset from race prediction analysis.
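The aggregation step of this labeling procedure can be sketched as grouping per-sample predictions by subject and flagging subjects whose samples disagree. The majority vote used below as a provisional label is an assumption made for illustration; in the procedure described above, discrepant labels are corrected manually. Subject identifiers and label strings are hypothetical.

```python
from collections import Counter, defaultdict

def aggregate_subject_labels(predictions):
    """Group (subject_id, predicted_label) pairs by subject.

    Returns a provisional majority label per subject and the sorted list of
    subjects whose samples received more than one distinct label.
    """
    per_subject = defaultdict(Counter)
    for subject_id, label in predictions:
        per_subject[subject_id][label] += 1

    provisional = {s: c.most_common(1)[0][0] for s, c in per_subject.items()}
    discrepant = sorted(s for s, c in per_subject.items() if len(c) > 1)
    return provisional, discrepant

# Toy predictions from an automatic labeler.
preds = [("s1", "A"), ("s1", "A"), ("s1", "B"), ("s2", "B"), ("s2", "B")]
provisional, needs_review = aggregate_subject_labels(preds)
```

Subjects listed in `needs_review` are the ones that would be passed on for visual inspection.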

Age Information: The ground-truth age information is only provided in the MORPH and UTK-face datasets. Therefore, for the remaining datasets (CelebA, MUCT, and RaFD) we used the commercial-off-the-shelf A-COTS software to obtain the class labels of the original images. For the evaluation of our proposed model, we use the Mean Absolute Error (MAE) metric to measure the change in age prediction of the output images of PrivacyNet relative to the age predictions for the original face images. Therefore, the combination of all five datasets shows both changes in age prediction with respect to the original (for CelebA, MUCT, and RaFD) as well as the ground-truth age values (for MORPH and UTK-face datasets). For training the PrivacyNet model, we create three age groups based on the age values: young (A0), middle-aged (A1), and old (A2).

We understand that creating age groups will not preserve the relative textural changes of face aging that occur due to the non-stationary nature of patterns in the face aging process [35, 26]. However, this scheme is consistent with the treatment of the other two attributes, gender and race. Also, it should be emphasized that our objective is not to synthesize face images at particular ages (which is known as age synthesis); instead, the goal of the proposed method is to disturb the performance of arbitrary age predictors.

Identity Information: For matching analysis, we exclude the UTK-face dataset since the subject information is not provided. We have used three face matchers, a commercial-off-the-shelf software M-COTS, and two publicly available face matchers DR-GAN [72] and SE-ResNet-50 [73] (SE-Net for short) which were trained on VGGFace2 dataset [74].

A summary of the datasets and the number of subjects and samples is provided in Table II.

Datasets Train Test Excluded
# Subj # Samples # Subj # Samples Experiments
CelebA 8,604 167 2,795
MORPH 1,968 8,038
MUCT 185 2,508
RaFD 67 1,608 Race
UTK-face NA Matching
TABLE II: Summary of the datasets used in this study, with their number of subjects and samples in the train-test splits; The excluded experiments indicate the dataset is removed from the experiment for the reasons given in the text.

4 Experimental Results

The proposed PrivacyNet model is trained on the joint training subsets of CelebA and MORPH as explained in Section 3.4. Due to the memory-intensive training process, we used a batch-size of . The models were trained for iterations. The optimal hyperparameter settings for the weighting coefficients of the attribute loss terms were and . The matching term coefficient was set to , and the hyperparameter for the reconstruction term was set to . After training the PrivacyNet model, both the discriminator and the auxiliary face matcher subnetworks are discarded and only the generator is used for transforming the unseen face images in the evaluation datasets.

Additionally, we trained a cycle-GAN model [58] without the auxiliary face matcher as a baseline to study the effects of the face matcher. The cycle-GAN model is trained using the same protocol that was described for training PrivacyNet. In the remainder of this paper, we will refer to this method as “baseline-GAN”. The transformations of five different example images from the CelebA-test dataset are shown in Fig. 4.

Fig. 4: Five example face images from the CelebA dataset along with their transformed versions using PrivacyNet and baseline GAN [58] models. The rows are marked by their selected attributes: G: gender, R: race, and A: age, where the specific target age group is specified with A0 (young), A1 (middle-aged), and A2 (old).

The following subsections summarize the results of the experiments and analyze how the performances of the attribute classifiers and face matchers are affected by the face attribute perturbations via PrivacyNet.

4.1 Perturbing Facial Attributes

The performance assessment of the proposed PrivacyNet model involves three objectives:

  1. when an attribute is selected to be perturbed, the performance of unseen attribute classifiers must be decreased;

  2. the attribute classifiers should retain their performance on attributes that are not selected for perturbation;

  3. in all cases, the performance of unseen face matchers must be retained.

We conducted several experiments to assess whether the proposed PrivacyNet model meets these objectives.

Gender Classification Performance. We have considered three gender classifiers: a commercial-off-the-shelf software (G-COTS), AFFACT [28] and IntraFace [71]. For this comparison study, all five evaluation datasets that are listed in Table II were considered. The performances of the different gender classifiers on the original and perturbed images are measured using the Equal Error Rate (EER); the results are shown in Fig. 5. For a given image, PrivacyNet can produce up to 15 distinct outputs, depending on the combination of attributes that are selected for perturbation.
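The count of 15 outputs is consistent with one reading of the attribute markers used in the figures: gender and race are each either perturbed or left alone, and age is either left alone or mapped to one of the three groups A0, A1, or A2, with the case where nothing is perturbed excluded. The enumeration below is a sketch of that reading; the string labels joined with "+" are a hypothetical naming choice.

```python
from itertools import product

def enumerate_outputs():
    """List every non-empty combination of attribute perturbations."""
    combos = []
    for gender, race, age in product([False, True], [False, True],
                                     [None, "A0", "A1", "A2"]):
        if not gender and not race and age is None:
            continue  # skip the unperturbed case
        parts = (["G"] if gender else []) \
            + (["R"] if race else []) \
            + ([age] if age else [])
        combos.append("+".join(parts))
    return combos

outputs = enumerate_outputs()
```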

Fig. 5: Performance of three gender classifiers G-COTS, AFFACT, and IntraFace on original images as well as different outputs of the proposed model (the larger the difference the better). The results of a face mixing approach, as described in [19], are also shown. Different outputs are marked by their selected attributes: G: gender, R: race, and A: age, where the specific target age group is abbreviated as A0 (young), A1 (middle-aged), and A2 (old). The outputs of PrivacyNet where the gender attribute is selected for perturbation are shown in orange, and the rest are shown in blue.

The EER results shown in Fig. 5 indicate that PrivacyNet increases the error rate in the cases where the gender attribute is intentionally perturbed, which is desired. At the same time, it preserves the performance of gender classifiers when gender information is to be retained. The EER of gender classification using G-COTS software on gender-perturbed outputs increased to 20-40%, and the EER of gender classification using AFFACT and IntraFace on those outputs surpassed 60%. Comparisons between the gender prediction results on the outputs of PrivacyNet and the outputs of the face-mixing approach by Othman and Ross [19], as well as the model by Sim and Zhang [20], show that in the case of G-COTS, PrivacyNet results are superior to the reference works in terms of increasing the EER (Fig. 5).
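For reference, the EER is the operating point at which the false accept rate and the false reject rate are equal. A simple way to approximate it from classifier scores is to sweep the decision threshold and pick the point where the two rates are closest. The sketch below does this for a binary attribute with one class treated as positive; the function name and toy scores are hypothetical, and this is not the evaluation code used for the reported results.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate EER by sweeping thresholds over the observed scores.

    scores: higher means "more likely positive"; labels: 1 positive, 0 negative.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, None
    for t in np.concatenate([np.unique(scores), [np.inf]]):
        predicted_pos = scores >= t
        far = np.mean(predicted_pos[labels == 0])    # negatives accepted
        frr = np.mean(~predicted_pos[labels == 1])   # positives rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)

separable = equal_error_rate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
inverted = equal_error_rate([0.9, 0.8, 0.2, 0.1], [0, 0, 1, 1])
mixed = equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

An EER above 50%, as in the inverted toy case, means the classifier's scores are anti-correlated with the true labels.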

Note that we did not include the results of the GAN model in Fig. 5 to improve readability. However, we observed that the GAN model has shown larger deviations (which is advantageous) in cases where gender was intended to be perturbed. This is expected since the GAN model does not have the constraints from the auxiliary face matcher. Therefore, there is more flexibility for modifying the patterns of the face. However, a disadvantage of the GAN model is that it also significantly degrades the matching utility as shown in Section 4.2.

Race Prediction Performance. We conducted the race prediction analysis using commercial-off-the-shelf software, R-COTS. Similar to the gender classification experiments, we have shown the EER of race classification on original images as well as different outputs of PrivacyNet model in Fig. 6. Since the face mixing approach proposed in [19] was only formulated for gender and not race perturbations, we did not include it in this section.

Fig. 6: Performance of the race classifier R-COTS on original images as well as different outputs of the proposed model. Different outputs are marked by their selected attributes: G: gender, R: race, and A: age where the specific target age group is shown with A0, A1, and A2 (the larger the difference the better). The outputs of PrivacyNet where the race attribute is selected for perturbation are shown in orange, and the rest are shown in blue.

The EER results in Fig. 6 show that PrivacyNet successfully meets the objectives of our study for confounding race predictors. The outputs where race is not intended to be perturbed (shown in blue) show low EER values similar to the EER obtained from the original images (). On the other hand, when race is selected to be perturbed, the EER values increase significantly ( for CelebA and UTK-face, and for MORPH and MUCT datasets). The results of separately perturbing gender and race using the controllable face privacy method proposed in [20] are also shown for comparison. When the race attribute is perturbed according to [20], the performance is slightly higher than our model. However, the disadvantage of the controllable face privacy method [20] is that when it perturbs the gender attribute, it also affects the race predictions.

Age Prediction Performance. To assess the ability of PrivacyNet for confounding age information, we used a commercial-off-the-shelf age predictor (A-COTS), which has shown remarkable performance across the different datasets tested in this study (Fig. 7). We used the Mean Absolute Error values in unit of years to measure the change in age prediction before and after perturbing the images (Fig. 7). As mentioned previously (Section 3.4), the ground-truth age values for three datasets, CelebA, MUCT, and RaFD are not provided. Therefore, for these three datasets, the MAE values are computed as the difference between the age predictions on the output images and the predictions on the original images, while for the other two datasets, MORPH and UTK-face, the ground-truth values are used for computing the MAE values.
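The MAE used here is the average absolute difference, in years, between the ages predicted on the perturbed images and a reference age (either the prediction on the original image or the ground-truth value, depending on the dataset). A minimal sketch with made-up ages:

```python
import numpy as np

def mean_absolute_error(predicted_ages, reference_ages):
    """Average absolute difference in years between two sets of ages."""
    predicted = np.asarray(predicted_ages, dtype=float)
    reference = np.asarray(reference_ages, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))

# Toy values: ages predicted after perturbation vs. reference ages.
mae = mean_absolute_error([45, 60, 30], [25, 40, 35])
```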

Fig. 7: Change in age predictions made by A-COTS on different outputs of the proposed model with respect to the age prediction on original images for CelebA, MUCT and RaFD and the ground-truth age values for MORPH and UTK-face datasets. Different outputs are marked by their selected attributes: G: gender, R: race, and A: age where the specific target age group is shown with A0, A1 and A2. The outputs of PrivacyNet where the age attribute is selected for perturbation are shown in orange, and the rest are shown in blue.

The age-prediction results show that the MAE obtained from the outputs where age is not meant to be perturbed remains at approximately 5 years. However, when we intend to modify the age of face images, using label A2 results in the highest MAE (around 20 years for RaFD and 15 years for the other four datasets) compared to A0 and A1. A possible explanation for this observation is that, due to the nature of the aging process, larger textural changes occur in face images belonging to A2. The MAE of the A0 group is also relatively large (except for RaFD), which may be caused by the reversal of the textural changes. However, the results of the middle age group (A1) are similar to the cases where we did not intend to modify the age. We hypothesize that the small changes in A0 are also due to the non-stationary aging patterns; the age perturbations via the PrivacyNet model can potentially be improved by using an ordinal regression approach for age prediction.

4.2 Retaining the Matching Utility of Face Images

Besides obfuscating soft-biometric attributes in face images, another objective of this work is to retain the recognition utility of all outputs of PrivacyNet. For this purpose, we have conducted matching experiments using three unseen face matchers: commercial-off-the-shelf software (M-COTS) and two publicly available matchers, SE-ResNet-50 trained on the VGGFace2 dataset [74] (SE-Net for short), and DR-GAN [72]. Fig. 8 shows the ROC curves obtained from these matching experiments for four datasets, CelebA, MORPH, MUCT, and RaFD. The UTK-face dataset is removed from this analysis since it does not contain subject information. Since PrivacyNet generated 15 outputs for each input face image, the minimum and maximum True Match Rate (TMR) values at each False Match Rate (FMR) are computed and only the range of values for these 15 outputs is shown. Note that it is expected that the matching utility is retained in all these 15 outputs. Similarly, the range of TMR values at each FMR obtained from the 15 different outputs of the GAN model that did not have an auxiliary face matcher for training is also shown for comparison. The ROC curves of PrivacyNet are very close to the ones obtained from the original images for each dataset, compared to the baseline results, which both show significantly larger deviations.
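The range computation described above (minimum and maximum TMR at each FMR across several outputs) can be sketched as follows. This is a simplified illustration, not the evaluation code used in the study: the function names `tmr_at_fmr` and `tmr_range`, the thresholding rule, and the toy scores are all assumptions.

```python
def tmr_at_fmr(genuine_scores, impostor_scores, fmr):
    """True Match Rate at a target False Match Rate.

    A pair is accepted when its score is strictly greater than the
    threshold. The threshold is chosen so that at most fmr * N of the
    N impostor pairs are accepted.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("score lists must be non-empty")
    ranked = sorted(impostor_scores, reverse=True)
    allowed = int(fmr * len(ranked))  # impostor accepts we tolerate
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    return sum(s > threshold for s in genuine_scores) / len(genuine_scores)


def tmr_range(tmr_curves):
    """Per-FMR minimum and maximum over several ROC curves.

    tmr_curves: one list of TMR values per output, all sampled on the
    same FMR grid (for example, 15 curves for 15 outputs).
    """
    n_points = len(tmr_curves[0])
    if any(len(curve) != n_points for curve in tmr_curves):
        raise ValueError("all curves must share the same FMR grid")
    lower = [min(column) for column in zip(*tmr_curves)]
    upper = [max(column) for column in zip(*tmr_curves)]
    return lower, upper


# Toy scores for two hypothetical outputs, evaluated at FMR = 0 and 0.25.
fmr_grid = [0.0, 0.25]
impostor = [0.6, 0.3, 0.2, 0.1]
curve_a = [tmr_at_fmr([0.9, 0.8, 0.7, 0.4], impostor, f) for f in fmr_grid]
curve_b = [tmr_at_fmr([0.9, 0.8, 0.5, 0.2], impostor, f) for f in fmr_grid]
lower, upper = tmr_range([curve_a, curve_b])
```

A narrow band between `lower` and `upper` that stays close to the curve of the original images would indicate that every output retains its matching utility, which is the property the ROC plots are meant to show.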

In addition to the ROC curves, we have also plotted the Cumulative Match Characteristics (CMC) [75], as shown in Fig. 9. According to the CMC curves, the results of PrivacyNet match very closely with the CMC curves obtained from the original images in all cases, which shows that PrivacyNet retains the matching utility.
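For readers less familiar with CMC curves, the sketch below shows one common way to compute rank-k identification rates from a probe-by-gallery similarity matrix. The function name `cmc_curve`, the tie-handling rule, and the toy similarity values are assumptions made for illustration.

```python
def cmc_curve(similarity, true_index, max_rank):
    """Rank-k identification rates for k = 1..max_rank.

    similarity: one list per probe, holding its similarity to each
        gallery identity.
    true_index: position of the correct gallery identity for each probe.
    """
    if len(similarity) != len(true_index) or not similarity:
        raise ValueError("need one true index per probe")
    ranks = []
    for scores, truth in zip(similarity, true_index):
        # Rank of the true match: 1 plus the number of gallery entries
        # that score strictly higher than it (ties favor the true match).
        ranks.append(1 + sum(s > scores[truth] for s in scores))
    return [sum(r <= k for r in ranks) / len(ranks)
            for k in range(1, max_rank + 1)]


# Toy example with three probes and a gallery of three identities.
toy_similarity = [
    [0.9, 0.1, 0.2],  # true match at index 0, retrieved at rank 1
    [0.3, 0.5, 0.4],  # true match at index 2, retrieved at rank 2
    [0.2, 0.6, 0.1],  # true match at index 0, retrieved at rank 2
]
toy_cmc = cmc_curve(toy_similarity, [0, 2, 0], max_rank=3)
```

Comparing such a curve computed on perturbed images against the one computed on the original images is the kind of check summarized in Fig. 9.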

Fig. 8: ROC curves showing the performance of unseen face matchers on the original images compared with PrivacyNet, the baseline-GAN model [58], face mixing [19] and the controllable face privacy [20] method. The results show that the ROC curves of PrivacyNet have the smallest deviation from the ROC curve of the original images, meaning that the performance of the face matchers is minimally impacted, which is desired.
Fig. 9: CMC curves showing the identification accuracy of the unseen face matchers on original images as well as the range of CMC curves of both the PrivacyNet model and the baseline-GAN model [58], along with that of face mixing [19] and controllable face privacy [20] approaches. It should be noted that in cases where the results of PrivacyNet or GAN are not shown, the curves overlapped with the CMC curve that was computed for the original images, which means that there was no change in matching performance at all (which is the optimal case). The results confirm that transformations made by PrivacyNet preserve the matching utility of face images.

5 Conclusions

In this work, we designed PrivacyNet, which is a deep neural network model for imparting multi-attribute privacy to face images including age, gender, and race attributes. PrivacyNet utilizes a Semi-Adversarial Network (SAN) module combined with Generative Adversarial Networks (GAN) to perturb an input face image, where certain attributes are perturbed selectively, while other face attributes are preserved. Most importantly, the matching utility of face images from these transformations is preserved. Experimental results using three unseen face matchers as well as three unseen attribute classifiers show the efficacy of our proposed model in perturbing such attributes, while the matching utility of face images is not adversely impacted.


We would like to thank Pranavan Theivendiram and Terence Sim for kindly providing a Python API for controllable face privacy [20], which was used for the comparison studies. In addition, the authors would like to thank funding sources from NSF (Grant Number ), as well as computational resources that were acquired through funds from the University of Wisconsin Alumni Research Foundation.


  • [1] G. L. Foresti, C. Micheloni, L. Snidaro, and C. Marchiol, “Face detection for visual surveillance,” in 12th International Conference on Image Analysis and Processing, 2003. Proceedings.    IEEE, 2003, pp. 115–120.
  • [2] B. Kamgar-Parsi, W. Lawson, and B. Kamgar-Parsi, “Toward development of a face recognition system for watchlist surveillance,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 10, pp. 1925–1937, 2011.
  • [3] M. Grgic, K. Delac, and S. Grgic, “Scface–surveillance cameras face database,” Multimedia tools and applications, vol. 51, no. 3, pp. 863–879, 2011.
  • [4] H. Qezavati, B. Majidi, and M. T. Manzuri, “Partially covered face detection in presence of headscarf for surveillance applications,” in 2019 4th International Conference on Pattern Recognition and Image Analysis (IPRIA).    IEEE, 2019, pp. 195–199.
  • [5] T. Kwon and H. Moon, “Biometric authentication for border control applications,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 8, pp. 1091–1096, 2008.
  • [6] J. S. del Rio, D. Moctezuma, C. Conde, I. M. de Diego, and E. Cabello, “Automated border control e-gates and facial recognition systems,” Computers & Security, vol. 62, pp. 49–72, 2016.
  • [7] A. K. Bobak, A. J. Dowsett, and S. Bate, “Solving the border control problem: Evidence of enhanced face matching in individuals with extraordinary face recognition skills,” PloS one, vol. 11, no. 2, p. e0148148, 2016.
  • [8] A. Jain, A. A. Ross, and K. Nandakumar, Introduction to biometrics.    Springer Science & Business Media, 2011.
  • [9] A. Dantcheva, P. Elia, and A. Ross, “What else does your biometric data reveal? A survey on soft biometrics,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 3, pp. 441–467, 2016.
  • [10] E. S. Jaha, “Augmenting gabor-based face recognition with global soft biometrics,” in 2019 7th International Symposium on Digital Forensics and Security (ISDFS).    IEEE, 2019, pp. 1–5.
  • [11] V. Mirjalili and A. Ross, “Soft biometric privacy: Retaining biometric utility of face images while perturbing gender,” in Proceedings of International Joint Conference on Biometrics (IJCB), 2017.
  • [12] A. Das, A. Dantcheva, and F. Bremond, “Mitigating bias in gender, age and ethnicity classification: A multi-task convolution neural network approach,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 573–585.
  • [13] A. Dantcheva, C. Velardo, A. D’angelo, and J.-L. Dugelay, “Bag of soft biometrics for person identification,” Multimedia Tools and Applications, vol. 51, no. 2, pp. 739–777, 2011.
  • [14] E. J. Kindt, Privacy and data protection issues of biometric applications.    Springer, 2013.
  • [15] Y. Wu, F. Yang, Y. Xu, and H. Ling, “Privacy-Protective-GAN for privacy preserving face de-identification,” Journal of Computer Science and Technology, vol. 34, no. 1, pp. 47–60, 2019.
  • [16] Q. Sun, A. Tewari, W. Xu, M. Fritz, C. Theobalt, and B. Schiele, “A hybrid model for identity obfuscation by face replacement,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 553–569.
  • [17] B. Meden, P. Peer, and V. Struc, “Selective face deidentification with end-to-end perceptual loss learning,” in 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), 2018, pp. 1–7.
  • [18] S. Yang, A. Wiliem, S. Chen, and B. C. Lovell, “Using lip to gloss over faces in single-stage face detection networks,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 640–656.
  • [19] A. Othman and A. Ross, “Privacy of facial soft biometrics: Suppressing gender but retaining identity,” in European Conference on Computer Vision Workshop.    Springer, 2014, pp. 682–696.
  • [20] T. Sim and L. Zhang, “Controllable face privacy,” in 11th IEEE International Conference on Automatic Face and Gesture Recognition (FG), vol. 4, 2015, pp. 1–8.
  • [21] S. Chhabra, R. Singh, M. Vatsa, and G. Gupta, “Anonymizing k-facial attributes via adversarial perturbations,” arXiv preprint arXiv:1805.09380, 2018.
  • [22] J. Suo, L. Lin, S. Shan, X. Chen, and W. Gao, “High-resolution face fusion for gender conversion,” IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, vol. 41, no. 2, pp. 226–237, 2011.
  • [23] V. Mirjalili, S. Raschka, A. Namboodiri, and A. Ross, “Semi-Adversarial Networks: Convolutional autoencoders for imparting privacy to face images,” in Proceedings of 11th IAPR International Conference on Biometrics (ICB).    Gold Coast, Australia: IEEE, 2018.
  • [24] V. Mirjalili, S. Raschka, and A. Ross, “FlowSAN: Privacy-enhancing semi-adversarial networks to confound arbitrary face-based gender classifiers,” IEEE Access, pp. 1–1, 2019.
  • [25] V. Mirjalili, S. Raschka, and A. Ross, “Gender Privacy: An ensemble of Semi Adversarial Networks for confounding arbitrary gender classifiers,” in Proceedings of 9th IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), Los Angeles, CA, 2018.
  • [26] Z. Niu, M. Zhou, L. Wang, X. Gao, and G. Hua, “Ordinal regression with multiple output cnn for age estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4920–4928.
  • [27] W. Cao, V. Mirjalili, and S. Raschka, “Rank-consistent ordinal regression for neural networks,” arXiv preprint arXiv:1901.07884, 2019.
  • [28] M. Günther, A. Rozsa, and T. E. Boult, “AFFACT: alignment free facial attribute classification technique,” arXiv preprint arXiv:1611.06158, 2016.
  • [29] D. Bobeldyk and A. Ross, “Predicting soft biometric attributes from 30 pixels: A case study in NIR ocular images,” in 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW).    IEEE, 2019, pp. 116–124.
  • [30] G. Levi and T. Hassner, “Age and gender classification using convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015, pp. 34–42.
  • [31] J. Mansanet, A. Albiol, and R. Paredes, “Local deep neural networks for gender recognition,” Pattern Recognition Letters, vol. 70, pp. 80–86, 2016.
  • [32] S. Jia, T. Lansdall-Welfare, and N. Cristianini, “Gender classification by deep learning on millions of weakly labelled images,” in IEEE 16th International Conference on Data Mining Workshops, 2016, pp. 462–467.
  • [33] M. Castrillón-Santana, J. Lorenzo-Navarro, and E. Ramón-Balmaseda, “Descriptors and regions of interest fusion for in-and cross-database gender classification in the wild,” Image and Vision Computing, vol. 57, pp. 15–24, 2017.
  • [34] J.-C. Chen, A. Kumar, R. Ranjan, V. M. Patel, A. Alavi, and R. Chellappa, “A cascaded convolutional neural network for age estimation of unconstrained faces,” in Proceedings of the IEEE Conference on Biometrics Theory, Applications and Systems, 2016, pp. 1–8.
  • [35] S. Chen, C. Zhang, M. Dong, J. Le, and M. Rao, “Using Ranking-CNN for age estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5183–5192.
  • [36] A. Jourabloo, X. Yin, and X. Liu, “Attribute preserved face de-identification,” in International Conference on Biometrics (ICB), 2015, pp. 278–285.
  • [37] R. Gross, L. Sweeney, F. De la Torre, and S. Baker, “Model-based face de-identification,” in Computer Vision and Pattern Recognition Workshop (CVPRW), 2006.
  • [38] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
  • [39] W. Oleszkiewicz, P. Kairouz, K. Piczak, R. Rajagopal, and T. Trzciński, “Siamese generative adversarial privatizer for biometric data,” in Computer Vision – ACCV 2018.    Cham: Springer International Publishing, 2019, pp. 482–497.
  • [40] N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” arXiv preprint arXiv:1801.00553, 2018.
  • [41] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv preprint arXiv:1412.6572, 2014.
  • [42] O. Poursaeed, I. Katsman, B. Gao, and S. Belongie, “Generative adversarial perturbations,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4422–4431.
  • [43] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, “DeepFool: a simple and accurate method to fool deep neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2574–2582.
  • [44] Y. Su, G. Sun, W. Fan, X. Lu, and Z. Liu, “Cleaning adversarial perturbations via residual generative network for face verification,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2019, pp. 2597–2601.
  • [45] S. Taheri, M. Salem, and J.-S. Yuan, “Razornet: Adversarial training and noise training on a deep neural network fooled by a shallow neural network,” Big Data and Cognitive Computing, vol. 3, no. 3, 2019. [Online]. Available:
  • [46] A. Graese, A. Rozsa, and T. E. Boult, “Assessing threat of adversarial examples on deep neural networks,” in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA).    IEEE, 2016, pp. 69–74.
  • [47] A. Ross, S. Banerjee, C. Chen, A. Chowdhury, V. Mirjalili, R. Sharma, T. Swearingen, and S. Yadav, “Some research problems in biometrics: The future beckons,” Proceedings of 12th IAPR International Conference on Biometrics (ICB), 2019.
  • [48] A. Rozsa, M. Günther, E. M. Rudd, and T. E. Boult, “Are facial attributes adversarially robust?” in 2016 23rd International Conference on Pattern Recognition (ICPR), 2016, pp. 3121–3127.
  • [49] Q. Xie, Z. Dai, Y. Du, E. Hovy, and G. Neubig, “Controllable invariance through adversarial feature learning,” in Advances in Neural Information Processing Systems, 2017, pp. 585–596.
  • [50] P. Terhörst, N. Damer, F. Kirchbuchner, and A. Kuijper, “Unsupervised privacy-enhancement of face representations using similarity-sensitive noise transformations,” Applied Intelligence, pp. 1–18, 2019.
  • [51] A. Morales, J. Fierrez, and R. Vera-Rodriguez, “SensitiveNets: Learning agnostic representations with application to face recognition,” arXiv preprint arXiv:1902.00334, 2019.
  • [52] P. C. Roy and V. N. Boddeti, “Mitigating information leakage in image representations: A maximum entropy approach,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 2586–2594.
  • [53] S. Jia, T. Lansdall-Welfare, and N. Cristianini, “Right for the right reason: Training agnostic networks,” in International Symposium on Intelligent Data Analysis.    Springer, 2018, pp. 164–174.
  • [54] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems 27.    Curran Associates, Inc., 2014, pp. 2672–2680.
  • [55] Z. Lu, Z. Li, J. Cao, R. He, and Z. Sun, “Recent progress of face image synthesis,” in 4th IAPR Asian Conference on Pattern Recognition (ACPR), Nov 2017, pp. 7–12.
  • [56] S. Raschka and V. Mirjalili, Python Machine Learning, 3rd Ed.    Birmingham, UK: Packt Publishing, 2019.
  • [57] P. Isola, J. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, pp. 5967–5976.
  • [58] Y. Choi, M. Choi, M. Kim, J. Ha, S. Kim, and J. Choo, “StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2018, pp. 8789–8797.
  • [59] J. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in IEEE International Conference on Computer Vision (ICCV), Oct 2017, pp. 2242–2251.
  • [60] T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim, “Learning to discover cross-domain relations with generative adversarial networks,” arXiv preprint arXiv:1703.05192, 2017.
  • [61] G. Antipov, M. Baccouche, and J.-L. Dugelay, “Face aging with conditional generative adversarial networks,” in IEEE International Conference on Image Processing (ICIP).    IEEE, 2017, pp. 2089–2093.
  • [62] D. Ulyanov, A. Vedaldi, and V. S. Lempitsky, “Instance normalization: The missing ingredient for fast stylization,” CoRR, vol. abs/1607.08022, 2016.
  • [63] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Computer vision and pattern recognition (CVPR), 2016, pp. 770–778.
  • [64] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” arXiv preprint arXiv:1701.07875, 2017.
  • [65] O. M. Parkhi, A. Vedaldi, and A. Zisserman, “Deep face recognition,” in British Machine Vision Conference, vol. 1, 2015, p. 6.
  • [66] Z. Liu et al., “Deep learning face attributes in the wild,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 3730–3738.
  • [67] K. Ricanek and T. Tesafaye, “MORPH: A longitudinal image database of normal adult age-progression,” in Automatic Face and Gesture Recognition.    IEEE, 2006, pp. 341–345.
  • [68] S. Milborrow, J. Morkel, and F. Nicolls, “The MUCT landmarked face database,” PRASA, 2010.
  • [69] O. Langner, R. Dotsch, G. Bijlstra, D. H. Wigboldus, S. T. Hawk, and A. Van Knippenberg, “Presentation and validation of the radboud faces database,” Cognition and emotion, vol. 24, no. 8, pp. 1377–1388, 2010.
  • [70] Z. Zhang, Y. Song, and H. Qi, “Age progression/regression by conditional adversarial autoencoder,” in Computer Vision and Pattern Recognition (CVPR), 2017.
  • [71] F. De la Torre, W.-S. Chu, X. Xiong, F. Vicente, X. Ding, and J. Cohn, “Intraface,” in 11th IEEE International Conference on Automatic Face and Gesture Recognition (FG), vol. 1, 2015, pp. 1–8.
  • [72] L. Tran, X. Yin, and X. Liu, “Disentangled representation learning GAN for pose-invariant face recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
  • [73] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7132–7141.
  • [74] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, “VGGFace2: A dataset for recognising faces across pose and age,” in International Conference on Automatic Face and Gesture Recognition, 2018.
  • [75] B. DeCann and A. Ross, “Relating ROC and CMC curves via the biometric menagerie,” in Biometrics: Theory, Applications and Systems (BTAS).    IEEE, 2013, pp. 1–8.