Reducing Overlearning through Disentangled Representations by Suppressing Unknown Tasks

05/20/2020 · by Naveen Panwar et al.

Existing deep learning approaches for learning visual features tend to overlearn and extract more information than is required for the task at hand. From a privacy preservation perspective, the input visual information is not protected from the model, enabling the model to become more intelligent than it was trained to be. Current approaches for suppressing additional task learning assume that ground truth labels for the tasks to be suppressed are available during training. In this research, we propose a three-fold novel contribution: (i) a model-agnostic solution for reducing model overlearning by suppressing all unknown tasks, (ii) a novel metric to measure the trust score of a trained deep learning model, and (iii) a simulated benchmark dataset, PreserveTask, with five fundamental image classification tasks for studying the generalization behavior of models. In the first set of experiments, we learn disentangled representations and suppress the overlearning of five popular deep learning models: VGG16, VGG19, Inception-v1, MobileNet, and DenseNet on the PreserveTask dataset. Additionally, we show results of our framework on the colored MNIST dataset and practical applications of face attribute preservation on the Diversity in Faces (DiF) and IMDB-Wiki datasets.


1 Introduction

With the advent of deep learning (DL), models strive to perform composite tasks by learning complex relationships and patterns in noisy, unstructured data [32]. Feature entanglement [20, 9, 15] is an observed property whereby features learnt for a specific objective are shown to carry information and properties of other objectives. This is primarily attributed to the learning capacity of deep learning models and is exploited effectively in multiple applications of general intelligence such as multi-task learning [16] and transfer learning [35].

However, from a privacy preservation perspective, the model itself could learn all the private information in the data and become much more intelligent than the original intent it was trained for. This phenomenon is called model overlearning [33]. Consider the example described in Figure 1 (b), where a DL classifier is trained to detect the shape of an object from images. Using the extracted features, however, the size and location of the object in the image can also be predicted with sufficient accuracy. Thus, a DL classifier trained only for shape prediction is more intelligent than its objective of predicting only the shape of the object. The features used for predicting the shape and size of the object are highly entangled, as they share many common properties. Thus, to ensure that a trained DL model performs only one task, these shared representations have to be disentangled through explicit supervision. As an additional real-world example, consider a DL model trained to predict gender from a face image. The DL model learns highly generic features from the face image, enabling it to also predict the age and the identity of the person. In applications where the identity of the person has to be preserved, the DL model needs to be trained to suppress the identity prediction task while still performing gender prediction.

Figure 1: Visually distinguishing the concepts of model debiasing and reducing model overlearning (what we aim to do). The fundamental research motivation of this work is to study whether a learning model can be restricted to perform only one objective from a given training dataset.

A common fictional example quoted from the movie world is that of Skynet (spoiler alert: the future AI system in the Terminator franchise; Skynet is replaced by Legion in Terminator: Dark Fate), a neural-network-based super AI that becomes self-aware and tries to take over the human race. The motivation derived from that work of fiction is that if AI systems are allowed to overlearn and become more intelligent than what is required for a particular task, the resulting AI models could end up with a better understanding and command of the data than humans have.

Additionally, as shown in Figure 1 (a), the task of debiasing is to remove the bias (color) when learning a specific task (shape). This bias arises from the high correlation between color and shape in the input images. However, as shown in Figure 1 (b), our task of improving model trust is to forcefully ensure that the model learns to perform only one or a few selected tasks (shape) from the input images and unlearns all other tasks (color, size, location).

1.1 Research Contributions

Given that multiple classification tasks can be performed on the same image, the research question is: “How can we ensure that the model is learnt only for one task (called the preserved task) and is strictly not learnt for the other tasks (called the suppressed tasks)?”. Pursuing research on this problem faces a few evident challenges: (i) there is a lack of a balanced, properly curated image dataset on which multiple classification tasks can be performed on the same image, (ii) complete knowledge of both the preserved tasks and the suppressed tasks would have to be known apriori, that is, we cannot suppress tasks that we have no information about, and (iii) there are very few model-agnostic studies on preserving and suppressing different task groups. The major research contributions are summarized as follows:

  1. A generic model-agnostic solution framework to reduce model overlearning by suppressing other tasks with shared entangled features. Feature disentanglement is performed using random unknown classes, breaking the assumption of requiring the ground truth labels for suppression tasks during training.

  2. A metric to measure the trust score of a trained DL model. The trust scores specify the amount of overlearning by the model for other tasks, with higher trust scores denoting suppression of overlearning.

  3. A simulated, class-balanced, multi-task dataset, PreserveTask with five tasks that could be performed on each image: shape, size, color, location, and background color classification.

  4. Experimental analyses are performed for the proposed framework in comparison with other existing approaches under different settings. We demonstrate the overlearning ability of five different deep learning models: VGG16, VGG19, Inception-v1, MobileNet, and DenseNet. We also show the effectiveness of feature disentanglement by suppressing unknown random tasks. The benchmark dataset along with the splits, baseline features, results, and code is available at: https://github.com/dl-model-recommend/model-trust.

  5. To demonstrate the practical applications and generalizability of the metric and the solution framework, we additionally show results on the colored MNIST dataset and on face attribute preservation using two datasets: (i) Diversity in Faces (DiF) [22] and (ii) IMDB-Wiki [29].

2 Literature Review

There are broadly two different groups of work related to the research problem at hand: (i) k-anonymity preservation and (ii) attribute suppression.

Figure 2: Landscape of the PreserveTask dataset describing the set of different possible tasks. Five tasks can be performed on each image and each task has a varying number of classes.

k-anonymity Preservation: The objective here is to preserve the anonymity of certain attributes from being predicted by the model. Among the earlier works, [3] studied masking out potentially sensitive information from video feeds. In the last decade, face recognition has become an important commercial application and also one that demanded discussion regarding privacy preservation. Studies focused on extracting only the required meta information from face images while not extracting the identity, a required step to make the face a usable biometric. Studies such as [7], [26], and [23] focused on preserving the identity of the face image from the model by performing face de-identification. Studies such as [24] and [27] focused on anonymizing the gender information of a face while models could still extract the identity.

Attribute Suppression: The aim of this group of techniques is to explicitly suppress a few attributes by perturbing the input data to the model. Studies such as [30] and [31] test whether the learnt models are robust and protected against adversarial attacks. [4] suggested using a constrained generative adversarial network (GAN) to perturb the input face image and suppress the required attribute; the GAN generates an attribute-free version of the original face image. The closest related work to our approach is the study by [8], where visual attributes are decorrelated using a negative gradient in the model. The results demonstrate that the classification task can be performed by preserving specific attributes in the image while suppressing the influence of the remaining ones.

Additionally, there is a good amount of research on bias mitigation while learning models [38] [11] [2] [17]. The primary aim there is to debias the model from learning any kind of correlated attributes [1] [28] [12] [37], which is different from our aim of improving the model’s trust. The major gaps in the existing research are: (i) most techniques focus on data perturbation, that is, changing the input data so that the suppressed task information is no longer available in it; there is little focus on model perturbation that leaves the input data unaltered, (ii) most existing datasets have only binary attributes, and hence suppressing and preserving a few tasks does not translate to the classification complexity of multi-class tasks, and (iii) there is a lack of a well curated benchmark dataset to evaluate the privacy preserving capacity of DL models.

3 PreserveTask Dataset

Shared tasks performed on the same image carry some common attributes, which are often extracted by complex deep learning models. The objective of this work is to disentangle the shared tasks and enable deep learning models to perform only one (or a few) of them. In order to evaluate the performance of such a framework, the dataset should have the following properties:

  • Multiple tasks should be performable on the same image, and each task should have a varying number of classes, in order to study the effect of classification task complexity.

  • As this research area is nascent, the dataset should be noise-free and class balanced, to avoid other complexities that could influence classification performance.

  • Tasks should be designed such that some tasks share common attributes and features, while others are independent of each other.

There are some similar publicly available datasets in the literature. The LFW [14], CelebA [19], IMDB-Wiki [29], AwA 2 [13], and CUB [36] datasets have multiple binary classification tasks but only one non-binary classification task. It is challenging to study the influence of classification task complexity using these datasets, and they therefore do not extend well to practical applications. The CLEVR [10] dataset provides four different tasks with a variable number of classes. However, each image contains multiple objects with different shapes, colors, and textures, allowing multiple labels for each task; task suppression in a multi-label, multi-task classification setting is a very challenging experimental setting.

Figure 3: (a) A deep learning model learning features suited for multiple tasks, more than the intended shape classification task, (b) Existing approaches suppress other known tasks, such as size classification, by backpropagating a negative loss or gradient, (c) Proposed approach of suppressing all possible n-class classification tasks by using random class labels.

Inspired by the CLEVR dataset, we create a new PreserveTask dataset, a multi-task dataset exclusively designed for the purpose of benchmarking models against preserving task privacy. The primary objective is to create an easy-to-perform multi-task dataset, where the performance of the individual tasks is high. As shown in Figure 2, the PreserveTask dataset has five different classification tasks: (i) Shape Classification (5): circle, triangle, diamond, pentagon, hexagon, (ii) Color Classification (7): violet, indigo, blue, green, yellow, orange, red, (iii) Size Classification (3): small, medium, large, (iv) Location Classification (4): quadrant 1, quadrant 2, quadrant 3, quadrant 4, (v) Background Color Classification (3): white, black, or colored.

These five tasks are chosen such that some tasks are highly correlated (size, shape), while others are ideally independent of each other (size, color). All the images are generated as colored images. There are 5 (shapes) × 7 (colors) × 3 (sizes) × 4 (locations) × 3 (background colors) = 1260 variations, with a fixed number of training and test images generated for each variation, ensuring a perfect class balance across all tasks. It is to be noted that the suppression of unknown shared tasks is a fairly open research problem. Hence, in order to set a benchmark for different frameworks, the easy, straightforward PreserveTask dataset was created as a conscious decision, without the kind of noise present in, for example, the DeepFashion [18] dataset. As the problem area matures, further extensions of this dataset could be generated and more real-world natural objects could be added.
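Since the images are fully synthetic, a single PreserveTask-style sample and its five task labels can be generated procedurally. Below is a minimal Pillow-based sketch of such a generator (requiring Pillow ≥ 8 for regular_polygon); the function name generate_sample, the RGB values, the pixel radii, and the quadrant placement are illustrative assumptions and not the released generation code.

```python
import random
from PIL import Image, ImageDraw

SHAPES = ["circle", "triangle", "diamond", "pentagon", "hexagon"]
COLORS = {"violet": (148, 0, 211), "indigo": (75, 0, 130), "blue": (0, 0, 255),
          "green": (0, 128, 0), "yellow": (255, 255, 0), "orange": (255, 165, 0),
          "red": (255, 0, 0)}
SIZES = {"small": 8, "medium": 14, "large": 20}               # radii in pixels (assumed)
BACKGROUNDS = {"white": (255, 255, 255), "black": (0, 0, 0), "colored": (70, 130, 180)}
N_SIDES = {"triangle": 3, "diamond": 4, "pentagon": 5, "hexagon": 6}
QUADRANT_CENTRES = {1: (75, 25), 2: (25, 25), 3: (25, 75), 4: (75, 75)}  # 100x100 canvas

def generate_sample(img_size=100):
    """Draw one synthetic image and return it together with its five task labels."""
    shape = random.choice(SHAPES)
    color = random.choice(list(COLORS))
    size = random.choice(list(SIZES))
    location = random.choice(list(QUADRANT_CENTRES))
    background = random.choice(list(BACKGROUNDS))

    img = Image.new("RGB", (img_size, img_size), BACKGROUNDS[background])
    draw = ImageDraw.Draw(img)
    cx, cy = QUADRANT_CENTRES[location]
    r = SIZES[size]
    if shape == "circle":
        draw.ellipse([cx - r, cy - r, cx + r, cy + r], fill=COLORS[color])
    else:
        draw.regular_polygon((cx, cy, r), N_SIDES[shape], fill=COLORS[color])

    labels = {"shape": shape, "color": color, "size": size,
              "location": location, "background": background}
    return img, labels
```

Iterating over the full Cartesian product of the label sets, rather than sampling randomly, yields the 1260 balanced variations described above.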

4 Proposed Approach

To understand the current scenario of model overlearning, consider any deep learning model as shown in Figure 3 (a). Assume a deep learning model, say VGG19, is trained for predicting the shape of objects in images. Ideally, the features obtained from the model should be good only for object shape prediction. However, it is observed that the learnt representation is highly entangled with other tasks such as size, color, and location. This enables us to train prediction classifiers for other tasks on top of these features without needing the original data. In the literature, a few technique variants exist to suppress the model from learning certain attributes or tasks [25, 21, 5]. As shown in Figure 3 (b), if the model has to be suppressed from learning the size of the objects, a negative loss or negative gradient is applied so that the features carry no information about the size of the object while retaining all the information about its shape. This comes with the assumption that the information and class labels of the tasks to be suppressed are available at training time for the entire training data.

In our proposed framework, we overcome this assumption and do not expect the suppression task information to be available at model training time. Additionally, we provide a model-agnostic approach to suppressing task overlearning, so that the framework can be directly applied to any deep learning model. Let $x$ be the input data and $T = \{T_1, \ldots, T_m\}$ be the different tasks that could be performed on the image. We learn a model $f_\theta$, where $f_\theta(x)$ is the feature representation learnt for the given preserved task $T_p$. Ideally, only $T_p$ should be predictable from $f_\theta(x)$; however, we observe that $f_\theta(x)$ provides high classification accuracy for the other tasks $T_j$, $j \neq p$, in most cases. To overcome this challenge, we generate random n-class labels to simulate any possible n-class classification task that needs to be suppressed. These random labels generated for an unknown task are provided to a gradient reversal (GR) branch [6] in order to suppress any other n-class classification, as shown in Figure 3 (c). Multiple gradient reversal branches could be built for varying values of $n$ to suppress all possible other classification tasks. The DL model is trained with a custom loss function as follows,

$$\mathcal{L} = \mathcal{L}_p\big(y_i, \hat{y}_i\big) - \lambda\, \mathcal{L}_s\big(\tilde{y}_i, \hat{s}_i\big) \qquad (1)$$

where $\mathcal{L}_p$ is the loss of the model branch trained for the task to be preserved, $T_p$, and $\mathcal{L}_s$ is the loss obtained from the other branch, which needs to be maximized (task suppression). $y_i$ is the actual ground truth label of the $i$-th sample for task $T_p$, and $\tilde{y}_i$ is a randomly generated class label in the space $\{1, \ldots, n\}$. $\lambda$ is the regularization parameter controlling the weight given to the minimization and maximization losses and is a hyperparameter chosen manually based on the amount of sharing between the tasks. $\mathcal{L}_p$ and $\mathcal{L}_s$ can be any of the popular loss functions, depending on the number of classes and whether the problem is classification, regression, or multi-label classification. Thus, the proposed framework is both DL-model agnostic and loss-function agnostic.
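To make Figure 3 (c) concrete, the following is a minimal TensorFlow/Keras sketch of a shared backbone with a preserved-task head and a suppression head placed behind a gradient reversal layer; the names GradientReversal and build_suppression_model, the softmax heads, and the cross-entropy losses are illustrative assumptions rather than the released implementation.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class GradientReversal(layers.Layer):
    """Identity in the forward pass; scales the gradient by -lam in the backward
    pass, so the shared features are pushed to unlearn the suppression task."""
    def __init__(self, lam=1.0, **kwargs):
        super().__init__(**kwargs)
        self.lam = lam

    def call(self, x):
        @tf.custom_gradient
        def _reverse(x):
            return tf.identity(x), lambda dy: -self.lam * dy
        return _reverse(x)

def build_suppression_model(backbone, n_preserve, n_suppress, lam=1.0):
    features = backbone.output                          # shared (entangled) features
    preserve = layers.Dense(n_preserve, activation="softmax", name="preserve")(features)
    reversed_feats = GradientReversal(lam)(features)    # gradient reversal branch
    suppress = layers.Dense(n_suppress, activation="softmax", name="suppress")(reversed_feats)
    model = keras.Model(backbone.input, [preserve, suppress])
    # Both heads minimise cross-entropy; the reversed gradient of the suppress head
    # with respect to the shared features realises the -lambda * L_s term of Eq. (1).
    model.compile(optimizer="adam",
                  loss={"preserve": "categorical_crossentropy",
                        "suppress": "categorical_crossentropy"})
    return model
```

During training, the preserve head receives the ground truth labels of the preserved task, while the suppress head receives the labels of the task to be unlearnt (random n-class labels in our setting); the regularization weight λ is folded into the lam argument of the reversal layer.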

4.1 Trust Score

PreserveTask is used as the benchmark dataset against which the trust score of any trained DL model can be extracted. The trained DL model is evaluated against the different tasks in PreserveTask and the entire confusion matrix of performance accuracies is obtained (a 5 × 5 matrix corresponding to the five tasks). An ideal DL model would provide 100% accuracy on the leading diagonal, i.e., on the tasks it was trained for, while providing random classification accuracy for the other tasks. The confusion matrix for such an ideal DL model is shown in Figure 4. For example, in the first row, the DL model was trained to learn and predict the color of the object. Hence, color prediction performance should be 100%, while the other tasks should provide only random accuracy of 1/n, where n is the number of classes of the respective task.

Let the ideal performance matrix be denoted as $A_I$ and the obtained performance matrix for a given trained DL model as $A_M$. Intuitively, a matrix that does not deviate much from the ideal matrix should have a higher trust score. The trust score is computed as follows,

$$\text{TrustScore}(A_M) = 1 - \frac{\Sigma\big(W \odot |A_I - A_M|\big)}{\Sigma(W)}, \qquad W = (k-1)\,\mathbb{I} + (\mathbb{J} - \mathbb{I}) \qquad (2)$$

where $W$ provides the weight corresponding to each task pair, $\mathbb{I}$ is an identity matrix and $\mathbb{J}$ is a ones matrix, each of dimensionality $k \times k$, where $k$ is the number of tasks including the preserved and suppressed tasks. In the PreserveTask dataset, $k = 5$, resulting in a $W$ matrix whose leading diagonal elements are 4 while the rest of the elements are 1. Since for each preserved task there are four suppressed tasks, the deviation of the preserved task from the ideal matrix is scaled by a factor of four to normalize the computation. Also, $|A_I - A_M|$ represents the element-wise absolute difference between the $A_I$ and $A_M$ matrices, and $\Sigma(\cdot)$ is the sum of all elements of a matrix.

Note that if the diagonal elements perform poorly, the concern is the performance of the model itself. On the contrary, if the non-diagonal elements show high performance, the concern is the trust of the model from a privacy preservation perspective. The proposed metric implements this notion to compute the trustworthiness of a trained DL model. The trust score is bounded between [0, 1]; the ideal matrix obtains a trust score of 1, while a matrix in which every task, preserved and suppressed alike, is classified with perfect accuracy obtains a considerably lower score. By empirical analysis, we observe that trust scores above an upper threshold are highly desirable, scores in an intermediate band are practically acceptable, and scores below that band are considered poor. To understand the sensitivity of the proposed metric, note that raising any single non-diagonal element of the ideal matrix to perfect accuracy lowers the trust score by a fixed amount; a reduction of that size therefore corresponds to one additional task being overlearnt by the classifier.
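For illustration, a NumPy sketch of the trust score computation is given below; the weight matrix W = (k − 1)·I + (J − I) and the normalization by Σ(W) follow the reading of Equation (2) above, and the function and argument names are assumptions rather than the released implementation.

```python
import numpy as np

def trust_score(acc_matrix, random_acc):
    """Trust score of Eq. (2).

    acc_matrix[i, j]: accuracy on task j of a probe trained on features from a
    model whose preserved task is i. random_acc[j]: chance accuracy (1 / number
    of classes) of task j, e.g. [1/5, 1/7, 1/3, 1/4, 1/3] for PreserveTask."""
    k = acc_matrix.shape[0]
    # Ideal matrix A_I: perfect accuracy on the preserved (diagonal) task,
    # chance accuracy on every suppressed (off-diagonal) task.
    ideal = np.tile(np.asarray(random_acc, dtype=float), (k, 1))
    np.fill_diagonal(ideal, 1.0)
    # Weight matrix W: the diagonal deviation is scaled by (k - 1) so that the
    # preserved task and its (k - 1) suppressed tasks contribute equally.
    weights = (k - 1) * np.eye(k) + (np.ones((k, k)) - np.eye(k))
    deviation = np.sum(weights * np.abs(ideal - acc_matrix))
    return 1.0 - deviation / np.sum(weights)
```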

5 Experimental Results

In this section, we show the experimental results and perform analysis of the proposed framework. Initially, we measure the trustworthiness of the existing models. We then experimentally demonstrate suppression of different tasks in various experimental settings. All the experiments are performed using the PreserveTask dataset. For additional results and detailed comparison with other techniques, please refer to the appendix.

Figure 4: (Left) The accuracy matrix demonstrating the behavior of an ideal trusted DL model. The leading diagonal shows perfect classification while the rest of the values are random classification. (Right) The accuracy matrix detailing the shared task performance of Inception-v1 on the PreserveTask dataset.
Figure 5: (Left) Trust scores obtained for various DL models. It can be observed that, of the five models, Inception-v1 and MobileNet have the lowest and highest trust scores, respectively. (Right) Trust scores obtained after various suppression techniques for Inception-v1. It can be observed that, using random labels for unknown tasks, we could improve the trustworthiness.
Figure 6: The performance matrices obtained after suppressing the known tasks in (a), (b) and the unknown tasks in (c), (d). Comparative results between a baseline negative loss function and the proposed GR-layer-based suppression are also shown. All results are computed for the Inception-v1 model.

5.1 How Trustworthy are Existing Models?

Consider a popular deep learning model, Inception-v1 [34], consisting of 22 computational layers. The model was trained from scratch on PreserveTask for the task of shape classification, providing high shape classification accuracy. In order to study whether this deep learning model learnt additional visual attributes as well, the output of the last flatten layer was extracted. Four different two-hidden-layer neural network classifiers (with default scikit-learn parameters) were trained on the extracted features to predict the size, color, location, and background color of the objects. The prediction performance for size, location, and background color is very high, showing that the features obtained from the Inception-v1 model carry information corresponding to these tasks as well. The color prediction performance, in contrast, is very low, as shape and color prediction are inherently independent tasks. A similar experiment is repeated by training the Inception-v1 model on each task in turn and using the learnt features to predict the performance on the other tasks; the results are shown in Figure 4. Ideally, only the diagonal elements of this confusion matrix should have high accuracies (red in color) while the rest of the predictions should have low accuracies (green in color). Accordingly, the trust score of the trained Inception-v1 model (proposed in Section 4.1) was found to be very poor.
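This probing protocol can be sketched as follows: freeze the backbone trained on one task, extract its flatten-layer features, and fit one small classifier per task. The helper name probe_overlearning and the hidden-layer sizes (mirroring the classifier described in the appendix) are assumptions; the text above mentions default scikit-learn parameters.

```python
from sklearn.neural_network import MLPClassifier

def probe_overlearning(train_feats, test_feats, train_labels, test_labels):
    """Compute one row of the performance matrix of Figure 4.

    train_feats/test_feats: flatten-layer features from a backbone trained on a
    single preserved task. train_labels/test_labels: dicts mapping each task
    name (shape, color, size, location, background) to its label vector."""
    row = {}
    for task, y_train in train_labels.items():
        # Two-hidden-layer probe classifier; layer sizes are assumed here.
        probe = MLPClassifier(hidden_layer_sizes=(512, 100), max_iter=200)
        probe.fit(train_feats, y_train)
        row[task] = probe.score(test_feats, test_labels[task])
    return row
```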

In order to further demonstrate that this additional intelligence is not a property of just the Inception-v1 model, similar experiments are performed using four other popular deep learning models: VGG16, VGG19, MobileNet, and DenseNet. The trust scores of all the DL models are shown in Figure 5 (left). It can be observed that, out of these five models, Inception-v1 and DenseNet have the lowest trust scores while MobileNet has the highest. While one could argue that the Inception-v1 model learns highly generic features supporting multi-task and transfer learning, from a privacy preservation perspective the model is found to have a poor trust score. This leads to the open question, “Do models always need to be additionally intelligent, and if not, how do we suppress them?”

5.2 How to Suppress Known Tasks?

In this section, we perform experiments to suppress tasks that are known apriori during training, that is, the ground truth labels of the suppressed task are available. For simplicity in demonstrating the experimental results, we assume that one task is to be preserved and one task is to be suppressed, using the Inception-v1 model. This experimental setting is similar to the approach explained in Figure 3 (b). The gradient reversal (GR) layer unlearns the suppressed task while learning the preserved task. In order to compare the performance of GR, we also use a customized negative loss function which minimizes the loss obtained for the preserved task while maximizing the loss obtained for the suppressed task, weighted by a constant factor. The features eventually extracted from the flatten layer should show similar performance on the preserved task and reduced performance on the suppressed task.

Figure 6 (a) and (b) demonstrate the results obtained for Inception-v1 using the negative loss function and the proposed GR layer, respectively. While the leading diagonal elements show the same performance as in Figure 4, it can be observed that the prediction results for the suppressed tasks are reduced in most cases. For example, while preserving object shape prediction and suppressing background color prediction, the background color prediction performance dropped noticeably, indicating that the extracted features no longer contain information about the background color of the image. The corresponding trust scores are shown in Figure 5 (right). It can be observed that suppressing known tasks using the GR layer improves the trust score over the baseline model.

Figure 7: Comparison of color prediction performance with and without using the different task suppression mechanisms. It can be observed that using random labels reduces the performance of color prediction irrespective of whether the preserved task was shape or size prediction.

5.3 How to Suppress Unknown Tasks?

The results in the previous section relied on the assumption that the ground truth labels of the suppressed task are available while training the Inception-v1 model. In an attempt to break that assumption, the experimental setting discussed in Figure 3 (c) is performed. Instead of the actual ground truth labels of a particular task, randomly generated n-class labels are used in every mini-batch. Thus, for the same mini-batch in the next epoch, a different set of random class labels is generated to be maximized. This ensures that the model does not memorize a single suppression task but instead learns to suppress all possible n-class classification tasks.
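A minimal sketch of drawing such labels is shown below; the function name and the one-hot encoding are assumptions for illustration, and the essential detail is that a fresh draw is made for every mini-batch.

```python
import numpy as np

def random_suppression_labels(batch_size, n_classes, rng=None):
    """Fresh random n-class labels for the suppression branch of Figure 3 (c).

    Because the labels change every mini-batch, the model cannot memorise any
    single spurious task and instead learns to suppress all n-class tasks."""
    rng = rng or np.random.default_rng()
    labels = rng.integers(0, n_classes, size=batch_size)
    return np.eye(n_classes, dtype="float32")[labels]   # one-hot for cross-entropy

# Example mini-batch targets for the two-headed model of Section 4:
# y_batch = {"preserve": y_shape_batch,
#            "suppress": random_suppression_labels(len(y_shape_batch), 3)}
```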

Figure 8: (Left) Sample images from the colored MNIST dataset. (Right) TSNE plot of the feature distribution of 392 images (class 0, foreground color: red and cyan) before and after suppressing the color prediction task.

Figure 6 (c) and (d) demonstrate the results obtained by using random class labels. In comparison with Figure 4, it can be observed that using random class labels performs well in certain settings. For example, while preserving the shape features and suppressing the prediction of background color, the original model's background color prediction performance is reduced when using the actual background color labels and reduced even further when using random 3-class labels. This is further highlighted in Figure 7, where color prediction is chosen as the task to be suppressed while shape and size are independently preserved. It can be observed that the proposed framework reduces the performance of color prediction when shape prediction is the preserved task, both with actual labels and with random labels, and a similar performance reduction is observed when size prediction is the preserved task.

We conclude that using random labels for task suppression produces a trust score comparable to using known labels, while clearly improving over the baseline trust score of a DL model.

6 Case Study on Challenging Practical Datasets

Colored MNIST Dataset: We introduce two additional tasks of foreground and background color prediction into the MNIST dataset. As shown in Figure 8, colored MNIST images are created by randomly assigning one of the possible foreground colors and one of the possible background colors. A similar assignment is performed on both the training and test sets, to maintain the standard experimental protocol. A MobileNet model trained from scratch provides the baseline trust score. After using our framework for task suppression with random labels and gradient reversal based training on the suppression branch, we observe that the MobileNet model's trust score increases. In Figure 8 (middle), the TSNE plot shows that when the model is learnt only for shapes, the features of ‘red’ and ‘cyan’ colored images are still separable. However, after suppressing the color prediction task using the proposed framework, the features of ‘red’ and ‘cyan’ colored images are scattered and no longer separable, as shown in Figure 8 (right).

Diversity in Faces (DiF) Dataset: On the DiF dataset [22], we consider the tasks of gender (two-class) and pose (three-class) classification. The aim is to learn (preserve) only one of these while suppressing the other. Since the dataset is highly skewed across classes, we consider a class-balanced subset of images (please refer to the appendix for the exact data distribution and the detailed performance matrices obtained). We train an Inception-v1 model on this dataset from scratch and obtain a baseline trust score. Using our framework for task suppression with the GR layer and known class labels, the trust score of the model increases; with random unknown class labels, the model's trust score increases as well.

IMDB-Wiki Dataset: On the IMDB-Wiki dataset [29], we consider the tasks of gender (two-class) and age (ten-class) classification. The cropped face images of the Wiki dataset are used to train the DenseNet model (the second least trusted model according to our trust scores), which provides the baseline trust score. After using our framework for task suppression with known class labels, the trust score of the DenseNet model increases; with random unknown class labels, the model's trust score increases as well.

Thus, our framework for measuring and improving a DL model's trust has many practical applications. A face recognition system or a face-image-based gender recognition system can now be deployed with additional trust in the model's intelligence level.

7 Conclusion and Future Research

We showcased a model-agnostic framework for measuring and improving the trustworthiness of a model from a privacy preservation perspective. The proposed framework does not assume the availability of the suppression task labels at training time; similar performance is obtained by training against random classification boundaries. A novel simulated benchmark dataset called PreserveTask was created to methodically evaluate and analyze a DL model's capability in suppressing shared task learning. This dataset opens up further research opportunities in this important and practically necessary research domain. Experimentally, it was shown that popular DL models such as VGG16, VGG19, Inception-v1, DenseNet, and MobileNet show poor trust scores and tend to be more intelligent than they were trained to be. We also show practical case studies of our proposed approach in face attribute classification using (i) the Diversity in Faces (DiF) and (ii) the IMDB-Wiki datasets. We would like to extend this work by studying the effect of multi-label classification tasks during suppression.

References

  • [1] M. Alvi, A. Zisserman, and C. Nellåker (2018) Turning a blind eye: explicit removal of biases and variation from deep neural network embeddings. In European Conference on Computer Vision, pp. 556–572. Cited by: §2.
  • [2] J. Attenberg, P. Ipeirotis, and F. Provost (2015) Beat the machine: challenging humans to find a predictive model’s ”unknown unknowns”. Journal of Data and Information Quality (JDIQ) 6 (1), pp. 1. Cited by: §2.
  • [3] M. Boyle, C. Edwards, and S. Greenberg (2000) The effects of filtered video on awareness and privacy. In Proceedings of the 2000 ACM conference on Computer supported cooperative work, pp. 1–10. Cited by: §2.
  • [4] S. Chhabra, R. Singh, M. Vatsa, and G. Gupta (2018) Anonymizing k-facial attributes via adversarial perturbations. arXiv preprint arXiv:1805.09380. Cited by: §2.
  • [5] H. Edwards and A. Storkey (2015) Censoring representations with an adversary. arXiv preprint arXiv:1511.05897. Cited by: §4.
  • [6] Y. Ganin and V. Lempitsky (2014) Unsupervised domain adaptation by backpropagation. arXiv preprint arXiv:1409.7495. Cited by: §4.
  • [7] R. Gross, L. Sweeney, F. De la Torre, and S. Baker (2006) Model-based face de-identification. In 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), pp. 161. Cited by: §2.
  • [8] D. Jayaraman, F. Sha, and K. Grauman (2014) Decorrelating semantic visual attributes by resisting the urge to share. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1629–1636. Cited by: §2.
  • [9] Z. Jiang, Q. Wu, K. Chen, and J. Zhang (2019-06) Disentangled representation learning for 3d face shape. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §1.
  • [10] J. Johnson, B. Hariharan, L. van der Maaten, L. Fei-Fei, C. L. Zitnick, and R. Girshick (2017) CLEVR: a diagnostic dataset for compositional language and elementary visual reasoning. In Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on, pp. 1988–1997. Cited by: §3.
  • [11] B. Kim, H. Kim, K. Kim, S. Kim, and J. Kim (2018) Learning not to learn: training deep neural networks with biased data. arXiv preprint arXiv:1812.10352. Cited by: §2.
  • [12] B. Kim, H. Kim, K. Kim, S. Kim, and J. Kim (2019) Learning not to learn: training deep neural networks with biased data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9012–9020. Cited by: §2.
  • [13] C. H. Lampert, H. Nickisch, and S. Harmeling (2009) Learning to detect unseen object classes by between-class attribute transfer. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 951–958. Cited by: §3.
  • [14] E. Learned-Miller, G. B. Huang, A. RoyChowdhury, H. Li, and G. Hua (2016) Labeled faces in the wild: a survey. In Advances in face detection and facial image analysis, pp. 189–248. Cited by: §3.
  • [15] H. Lee, H. Tseng, J. Huang, M. Singh, and M. Yang (2018) Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 35–51. Cited by: §1.
  • [16] H. Li, D. Eigen, S. Dodge, M. Zeiler, and X. Wang (2019-06) Finding task-relevant features for few-shot learning by category traversal. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §1.
  • [17] Y. Li and N. Vasconcelos (2019) REPAIR: removing representation bias by dataset resampling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9572–9581. Cited by: §2.
  • [18] Z. Liu, P. Luo, S. Qiu, X. Wang, and X. Tang (2016) Deepfashion: powering robust clothes recognition and retrieval with rich annotations. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1096–1104. Cited by: §3.
  • [19] Z. Liu, P. Luo, X. Wang, and X. Tang (2015) Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3730–3738. Cited by: §3.
  • [20] B. Lu, J. Chen, and R. Chellappa (2019) Unsupervised domain-specific deblurring via disentangled representations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 10225–10234. Cited by: §1.
  • [21] D. Madras, E. Creager, T. Pitassi, and R. Zemel (2018) Learning adversarially fair and transferable representations. In International Conference on Machine Learning, pp. 3381–3390. Cited by: §4.
  • [22] M. Merler, N. Ratha, and R. S. Feris (2019) Diversity in faces. arXiv preprint arXiv:1901.10436. Cited by: item 5, §6.
  • [23] V. Mirjalili, S. Raschka, A. Namboodiri, and A. Ross (2018) Semi-adversarial networks: convolutional autoencoders for imparting privacy to face images. In 2018 International Conference on Biometrics (ICB), pp. 82–89. Cited by: §2.
  • [24] V. Mirjalili and A. Ross (2017) Soft biometric privacy: retaining biometric utility of face images while perturbing gender. In Biometrics (IJCB), 2017 IEEE International Joint Conference on, pp. 564–573. Cited by: §2.
  • [25] S. Narayanaswamy, T. B. Paige, J. Van de Meent, A. Desmaison, N. Goodman, P. Kohli, F. Wood, and P. Torr (2017) Learning disentangled representations with semi-supervised deep generative models. In Advances in Neural Information Processing Systems, pp. 5925–5935. Cited by: §4.
  • [26] E. M. Newton, L. Sweeney, and B. Malin (2005) Preserving privacy by de-identifying face images. IEEE transactions on Knowledge and Data Engineering 17 (2), pp. 232–243. Cited by: §2.
  • [27] A. Othman and A. Ross (2014) Privacy of facial soft biometrics: suppressing gender but retaining identity. In European Conference on Computer Vision, pp. 682–696. Cited by: §2.
  • [28] E. Raff and J. Sylvester (2018) Gradient reversal against discrimination: a fair neural network learning approach. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), pp. 189–198. Cited by: §2.
  • [29] R. Rothe, R. Timofte, and L. Van Gool (2018) Deep expectation of real and apparent age from a single image without facial landmarks. International Journal of Computer Vision 126 (2-4), pp. 144–157. Cited by: item 5, §3, §6.
  • [30] A. Rozsa, M. Günther, E. M. Rudd, and T. E. Boult (2016) Are facial attributes adversarially robust?. In Pattern Recognition (ICPR), 2016 23rd International Conference on, pp. 3121–3127. Cited by: §2.
  • [31] A. Rozsa, M. Günther, E. M. Rudd, and T. E. Boult (2017) Facial attributes: accuracy and adversarial robustness. Pattern Recognition Letters. Cited by: §2.
  • [32] S. Ruder (2017) An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098. Cited by: §1.
  • [33] C. Song and V. Shmatikov (2019) Overlearning reveals sensitive attributes. arXiv preprint arXiv:1905.11742. Cited by: §1.
  • [34] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826. Cited by: §5.1.
  • [35] W. Tang and Y. Wu (2019-06) Does learning specific features for related parts help human pose estimation?. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §1.
  • [36] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie (2011) The caltech-ucsd birds-200-2011 dataset. Cited by: §3.
  • [37] H. Wang, Z. Wu, Z. Wang, Z. Wang, and H. Jin (2019) Privacy-preserving deep visual recognition: an adversarial learning framework and a new dataset. arXiv preprint arXiv:1906.05675. Cited by: §2.
  • [38] J. Zhao, T. Wang, M. Yatskar, V. Ordonez, and K. Chang (2017) Men also like shopping: reducing gender bias amplification using corpus-level constraints. arXiv preprint arXiv:1707.09457. Cited by: §2.

8 Appendix

This supplementary material contains all the detailed hyper-parameters used by different models that we trained, to aid in reproducing the results that we showed in the research paper. Additionally, we provide more detailed analysis and visualizations of the results, that could not be included in the paper due to space constraints.

8.1 Baseline Deep Learning Models

Five different baseline deep learning models were used in the experiments: Inception-v1, VGG16, VGG19, DenseNet, and MobileNet. The parameters and training process used in these experiments are listed below, followed by a minimal preprocessing and compilation sketch:

  • The data is z-normalized to have a zero mean and unit standard deviation, before being provided to the models for training.

  • The standard architectures of Inception-v1, VGG16, VGG19, DenseNet, and MobileNet are borrowed from the default implementations in the Keras library.

  • The deep learning models were trained with categorical cross-entropy loss and the Adam optimizer.
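As an illustration of the points above, here is a minimal sketch of the z-normalization step and of instantiating and compiling one of the Keras baselines; VGG16 is used as the example since it ships with keras.applications, and the learning rate and amsgrad flag are placeholders because the text leaves the exact values unspecified.

```python
from tensorflow import keras

def z_normalize(x_train, x_test):
    """Zero-mean, unit-standard-deviation normalisation, fit on the training split."""
    mean, std = x_train.mean(), x_train.std()
    return (x_train - mean) / std, (x_test - mean) / std

def build_baseline(input_shape, num_classes):
    """A standard Keras architecture trained from scratch (VGG16 shown here)."""
    model = keras.applications.VGG16(weights=None, input_shape=input_shape,
                                     classes=num_classes)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3, amsgrad=True),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```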

8.2 Classifier Models

For all the experiments, a two-hidden-layer neural network is used as the classifier, to maintain consistency of the same classifier across all the experiments; a Keras sketch of this classifier follows the list below.

  • The architecture is Dense(512) → Dropout(0.5) → Dense(100) → Dropout(0.3) → Dense(num_of_classes).

  • Each of the Dense layers uses a non-linear activation function.

  • Categorical cross-entropy is used as the loss function with Adam as the optimizer.

  • A portion of the data is used as validation data and the model is trained with early stopping.

  • A fixed batch size was used to make the computation faster, and the experiments were run on a GPU.
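A minimal Keras sketch of this probe classifier is given below; the ReLU activations, the learning rate, and the amsgrad flag are assumptions where the bullet points above leave the exact values unspecified.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_probe_classifier(feature_dim, num_of_classes):
    """Two-hidden-layer classifier used to probe extracted features (Section 8.2)."""
    model = keras.Sequential([
        layers.Input(shape=(feature_dim,)),
        layers.Dense(512, activation="relu"),            # activation assumed (ReLU)
        layers.Dropout(0.5),
        layers.Dense(100, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(num_of_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3, amsgrad=True),  # values assumed
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```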

9 Experimental Results and Analysis

In this section, we include additional analysis, visualizations, and charts of the results presented in the main paper. To aid comparison, the charts and results from the main paper are also included here, so that the supplementary material can be read independently.

9.1 How Trustworthy are Existing Models?

Figure 9: Trust scores obtained for various DL models. It can be observed that, of the five models, Inception-v1 and DenseNet have the lowest trust scores while MobileNet has the highest.
Figure 10: The performance matrix heat-map detailing the shared task performance of Inception-v1 model on the PreserveTask dataset.
Figure 11: The performance matrix heat-map detailing the shared task performance of DenseNet model on the PreserveTask dataset.
Figure 12: The performance matrix heat-map detailing the shared task performance of MobileNet model on the PreserveTask dataset.
Figure 13: The performance matrix heat-map detailing the shared task performance of VGG-16 model on the PreserveTask dataset.
Figure 14: The performance matrix heat-map detailing the shared task performance of VGG-19 model on the PreserveTask dataset.

9.2 How to Suppress Tasks?

Figure 15: Trust scores obtained after various suppression techniques. It can be observed that even using random labels for unknown tasks, we could improve the trustworthiness of the Inception-v1 model on the PreserveTask dataset.
Figure 16: The performance matrix heat-map, after suppressing a known task using negative loss, detailing the shared task performance of Inception-v1 model on the PreserveTask dataset.
Figure 17: The performance matrix heat-map, after suppressing a known task using GR layer, detailing the shared task performance of Inception-v1 model on the PreserveTask dataset.
Figure 18: The performance matrix heat-map, after suppressing an unknown task using negative loss, detailing the shared task performance of Inception-v1 model on the PreserveTask dataset.
Figure 19: The performance matrix heat-map, after suppressing an unknown task using GR layer, detailing the shared task performance of Inception-v1 model on the PreserveTask dataset.

10 Case Study: Face Attribute Preservation

Figure 20: Trust scores obtained in the Diversity in Faces (DiF) dataset after various suppression techniques. It can be observed that even using random labels for unknown tasks, we could improve the trustworthiness of the Inception-v1 model.
Figure 21: The performance matrix heat-map detailing the shared task performance of Inception-v1 model on the Diversity in Faces (DiF) dataset.
Figure 22: The performance matrix heat-map obtained after suppressing the known tasks, detailing the shared task performance of Inception-v1 model on the Diversity in Faces (DiF) dataset.
Figure 23: The performance matrix heat-map obtained after suppressing the unknown tasks, detailing the shared task performance of Inception-v1 model on the Diversity in Faces (DiF) dataset.
Figure 24: Trust scores obtained on the Wiki face dataset after various suppression techniques. It can be observed that even using random labels for unknown tasks, we could improve the trustworthiness of the DenseNet model.
Figure 25: The performance matrix heat-map detailing the shared task performance of DenseNet model on the Wiki face dataset.
Figure 26: The performance matrix heat-map obtained after suppressing the known tasks, detailing the shared task performance of DenseNet model on the Wiki face dataset.
Figure 27: The performance matrix heat-map obtained after suppressing the unknown tasks, detailing the shared task performance of DenseNet model on the Wiki face dataset.