Adversarial Representation Sharing: A Quantitative and Secure Collaborative Learning Framework

by   Jikun Chen, et al.
Shanghai Jiao Tong University

The performance of deep learning models depends heavily on the amount of training data. It is common practice for today's data holders to merge their datasets and train models collaboratively, which nevertheless poses a threat to data privacy. In contrast to existing methods such as secure multi-party computation (MPC) and federated learning (FL), we find that representation learning has unique advantages in collaborative learning due to its lower communication overhead and task independence. However, data representations face the threat of model inversion attacks. In this article, we formally define the collaborative learning scenario, and quantify data utility and privacy. Then we present ARS, a collaborative learning framework wherein users share representations of data to train models, and add imperceptible adversarial noise to data representations to defend against reconstruction or attribute extraction attacks. By evaluating ARS in different contexts, we demonstrate that our mechanism is effective against model inversion attacks, and achieves a balance between privacy and utility. The ARS framework has wide applicability. First, ARS is valid for various data types, not limited to images. Second, data representations shared by users can be utilized in different tasks. Third, the framework can be easily extended to the vertical data partitioning scenario.





1. Introduction

Deep learning has made great progress in a variety of fields, such as computer vision, natural language processing and recommendation systems. This success is largely attributed to the increasing amount of available computation and data (Goodfellow et al., 2016). Companies and institutions require large-scale data to build stronger machine learning systems, whereas data is generally held by distributed parties. Therefore, it is common practice for multiple parties to share data and train deep learning models collaboratively (Ohrimenko et al., 2016). In most collaborative training scenarios, users provide their local data to cloud computing services or share data with others, which raises privacy concerns.

Due to security considerations and privacy protection regulations, it is inappropriate to exchange data among different organizations. Sharing raw data directly may leak private and sensitive information contained in the datasets. For instance, if several hospitals hope to integrate their patients’ information to build models for disease detection, they must carefully protect the identity of patients from being obtained and abused by any partner or possible eavesdropper. The problem of “data islands” calls for a privacy-preserving collaborative framework.

Generally speaking, privacy-preserving collaborative learning can be grouped under two approaches. Earlier works focus on MPC, which ensures that multiple users can jointly compute a certain function while keeping their inputs secret without a trusted third party. In a joint learning framework, data on local devices is encrypted by cryptographic tools before being shared (Mohassel and Zhang, 2017; Agrawal et al., 2019). Cryptographic methods include garbled circuits (GC) (Yao, 1986), secret sharing (SS) (Paillier, 1999), homomorphic encryption (HE) (Gentry, 2009), etc. However, current cryptographic approaches support only a few types of operations, and offer efficient alternatives for only some non-linear functions (Mohassel and Zhang, 2017). Moreover, encryption often incurs high computation and communication costs.

Shokri and Shmatikov (2015) proposed a distributed stochastic gradient descent algorithm as an alternative to sharing training data. The method is now well known as federated learning (FL) (Hard et al., 2018) and is widely applied in practice. In FL, a cloud server builds a global deep learning model. In each training iteration, the server sends the model to a randomly selected subset of client devices. The clients then optimize the model locally and send their updates back to the server, which aggregates them. Only the parameters of the model are communicated, while the training data is retained on the local device, which protects privacy. Some recent works combine federated learning with other information security mechanisms (e.g., differential privacy) to further improve privacy (Geyer et al., 2017). However, the communication cost between each local device and the central server is high, because after each training iteration every user needs to keep its local deep learning model synchronized.

Different from the above approaches, we consider representation learning (Bengio et al., 2013) to solve this problem. The idea is inspired by deep neural networks, which embed inputs into real-valued feature vectors (representations). Because they contain high-level features of the original data, latent representations are effective for various downstream machine learning tasks (Goodfellow et al., 2016). The motivation is that both MPC and FL conduct collaborative learning with limited task applicability. Once the machine learning task changes, the entire training process needs to be executed again, which incurs high communication cost. By comparison, data representations are task-independent and thus have unique potential in joint learning. Some recent works have studied privacy-preserving data representations (Xiao et al., 2020; Hitaj et al., 2017), but few of them discuss the collaborative learning scenario further. The primary problem in privacy-preserving representation learning is to defend against model inversion attacks (Mahendran and Vedaldi, 2015; He et al., 2019), which aim to train inverse models to reconstruct original inputs or extract private attributes from shared data representations.

In this article, we propose ARS (for Adversarial Representation Sharing), a decentralized collaborative learning framework based on data representations. Our work contains two levels: (i) a collaborative learning framework wherein users share data representations instead of raw data for further training; and (ii) imperceptible adversarial noise added to shared data representations to defend against model inversion attacks. ARS helps joint learning participants “encode” their data locally and then add adversarial noise to the representations before sharing them. The published adversarial latent representations can defend against reconstruction attacks, thereby ensuring privacy.

Owing to the good qualities of latent representations, ARS has wide applicability. First, ARS is valid for various data types as training samples, not limited to images. Second, ARS is task-independent: shared data representations are reusable for various tasks. Third, prior joint learning frameworks are commonly designed for horizontal data partitioning (in which datasets of users share the same feature space but differ in sample ID space), whereas ARS can easily be extended to the vertical data partitioning scenario (in which datasets of different users share the same sample IDs but differ in feature columns) (Yang et al., 2019).

Based on the collaborative learning framework, we apply adversarial example noise (Goodfellow et al., 2015) to protect shared representations from model inversion attacks. The intuition is that adding specially designed small perturbations to shared data representations can confuse adversaries so that they cannot reconstruct the original data or particular private attributes from the obfuscated latent representations. By simulating the behavior of attackers, we generate adversarial noise for potential inverse models. The noise is added to data representations before sharing them, in order to make it hard to recover the original inputs. At the same time, the scale of these perturbations is too small to affect data utility. We propose defense strategies against reconstruction attacks and attribute extraction attacks respectively.

The main contributions of our work are summarized as follows:

  • We propose ARS, a new paradigm for collaborative learning based on representation learning. Different from MPC and FL, ARS is decentralized and has wide applicability.

  • We introduce adversarial noise to defend against model inversion attacks. To the best of our knowledge, we are the first to apply adversarial examples to ensure privacy in collaborative learning.

  • We evaluate our mechanism on multiple datasets and from multiple aspects. The results demonstrate that ARS achieves a balance between privacy and utility. We further discuss the limitations and challenges of our work.

The remainder of the paper is organized as follows. We first review related work in Section 2. Then we introduce how ARS achieves collaborative learning from an overall perspective in Section 3, and detail how adversarial noise is applied to defend against model inversion attacks and ensure privacy in Section 4. Experimental results are shown in Section 5. In Section 6, we discuss the details and challenges of the work. Finally, we conclude the work in Section 7.

2. Related Work

2.1. Privacy-Preserving Representation Learning

To avoid privacy leakage in collaborative learning, some previous works focus on learning privacy-preserving representations (Xiao et al., 2020; Ferdowsi et al., 2020). Latent representations retain the abstract features of data, which can be used for further analysis such as classification or regression. Generally, the distribution of data representations can be learned by unsupervised latent variable models, such as autoencoders (Ng et al., 2011). Meanwhile, the original information of data cannot be directly inferred from representations. For example, in natural language processing, words are transformed into vectors by embedding networks, as in word2vec (Church, 2017). Without the embedding networks, it is difficult to recover the original words from the embedding vectors.

However, data representations are still vulnerable to model inversion attacks (He et al., 2019). Adversaries can build reconstruction networks to recover original data or reveal some attributes of data from shared representations, even though they have no knowledge of the structure or parameters of the feature extraction models (Mahendran and Vedaldi, 2015; He et al., 2019). For example, they can recover face samples, or infer gender, age and other personal information from shared representations of face images, which were only supposed to be used for training face recognition models.

In order to defend against inversion attacks, recent works focus on adversarial training (Xiao et al., 2020) or generative adversarial networks (GANs) (Hitaj et al., 2017). Attackers’ behaviors are simulated by another neural network while learning privacy-preserving representations of data, and the two networks compete against each other to improve the robustness of the representations. However, since it is hard to achieve a balance between the attacker models and defender models during training (Salimans et al., 2016), these methods may spend much time in the preprocessing phase. Ferdowsi et al. (2020) generate privacy-preserving representations by producing sparse codemaps. The above defense methods could be applied to collaborative learning, but the authors did not discuss this scenario further. In addition, all these methods are task-oriented. When the task of the shared data changes, they need to generate new task-oriented data representations again.

2.2. Adversarial Examples

Adversarial examples are perturbed inputs designed to fool machine learning models (Goodfellow et al., 2015). Formally, we denote by $f$ a classifier. For an input $x$ and a label $y$, we call a vector $r$ an adversarial noise if it satisfies:

$$f(x + r) \neq y, \qquad \|r\| \leq \epsilon,$$

where $\epsilon$ is a small hyper-parameter to adjust the scale of noise.
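The definition above can be illustrated with a toy example. The sketch below assumes a hypothetical binary linear classifier and a one-step sign-gradient (FGSM-style) perturbation whose scale is bounded in the max norm; the weights, the input and the noise scale are illustrative values, not taken from the paper.

```python
def predict(w, b, x):
    """Binary linear classifier: returns +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_noise(w, y, eps):
    """One-step sign-gradient noise for a linear score.

    For the margin loss -y * (w.x + b), the gradient with respect to x
    is -y * w, so the noise is eps * sign(-y * w).
    """
    return [eps * sign(-y * wi) for wi in w]


w, b = [2.0, -1.0, 0.5], 0.0   # hypothetical model parameters
x = [0.2, 0.1, 0.1]            # clean input
y = predict(w, b, x)           # label assigned to the clean input
r = fgsm_noise(w, y, eps=0.15)
x_adv = [xi + ri for xi, ri in zip(x, r)]
```

In this toy case every coordinate of the noise has magnitude 0.15, yet the predicted label of `x_adv` differs from that of `x`.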

Adversarial examples have strong transferability. Some works (Liu et al., 2017) have shown that adversarial examples generated for a model can often confuse another model. This property is used to execute transferability based attacks (Papernot et al., 2017). Even if an attacker has no knowledge about the details of a target model, it can still craft adversarial examples successfully by attacking a substitute model. Therefore, adversarial examples have become a significant threat to machine learning models (Goodfellow et al., 2015; Samangouei et al., 2018).

Besides treating adversarial examples as threats, some works utilize the properties of adversarial examples to protect users’ privacy (Sharif et al., 2016). In this work, we also use adversarial noise to defend against machine learning based inference attacks. To the best of our knowledge, we are the first to apply adversarial examples to data sharing mechanisms for collaborative learning.

Figure 1. An overview of our basic encoding-based privacy-preserving data sharing mechanism.

3. Overview of ARS framework

ARS achieves collaborative learning by helping users share data representations instead of raw data to train models. In this section, we first present a joint learning scenario, and propose standards for evaluating the effectiveness of a framework. Then we introduce how ARS works in the whole collaborative learning process. Specifically, we describe how participants encode their data into latent representations, how they share data representations with others, and how the shared data is used for various machine learning tasks.

3.1. Collaborative Learning Scenario

Consider the scenario where parties share local data for collaborative training. Note that data belonging to distributed parties can be partitioned horizontally (which means the datasets share the same feature space but differ in sample ID space) or vertically (which means the datasets share the same sample IDs but differ in feature space). For simplicity, we assume data is partitioned horizontally (the extension to vertical data partitioning is given in Section 4.4). The dataset of the -th party is represented as , where is a pair of training sample and corresponding label, and is the number of samples. The goal of each participant is to encode its data into representations , and finally train deep learning models on which is published by all the data owners.

For privacy reasons, participants have to defend against model inversion attacks that reconstruct original inputs from latent representations. Furthermore, there might be some sensitive features or attributes in training samples. Any label of a training sample can become the machine learning target or a private attribute, depending on the particular user and task. For example, when building a disease detection model, the illness of each patient can be a public label, while identity information such as gender and age should be private. We denote by the number of private attributes that the -th party has, and denote the set of labels of -th private attributes by , where . Attackers can also conduct attribute extraction attacks to recover these private attributes.

We highlight the following notations which are important to our discussion.

  • : original training set of user , and label set depending on deep learning tasks.

  • : the -th sample and corresponding label of user .

  • : value (label) of the -th private attribute of .

  • : feature extractor which encodes inputs into latent representations.

  • : parameters of the model .

  • : latent representation of the -th sample and label of user , which satisfies .

  • : downstream deep learning model which takes as the training set.

  • : (theoretical) inverse model of , which maps representations to reconstruction of inputs.

  • : decoder trained by adversaries to conduct model inversion attacks, aiming to reconstruct inputs from shared data representations.

3.2. Quantitative Criteria of ARS Mechanism

Formally, given a sample set and a label set , the main goal of data sharing in ARS is to design a mapping that has “nice” properties. These properties consist of the utility and privacy of shared data representations, as follows.

3.2.1. Utility

After obtaining the shared data, each user can use dataset to train downstream deep learning model , such as classifiers to predict the label from . Data utility requires the accuracy to approach the results of models trained directly on . Therefore, and should be optimized by minimizing the following expectation:


which ensures that encoding data into representations will not lose much data utility.

3.2.2. Privacy

Privacy characterizes the difficulty of finding a model to recover raw data, such as the visual content of image inputs. Without loss of generality, we focus on reconstruction attacks (attribute extraction attacks are discussed in Section 4.4). Therefore, ARS minimizes privacy leakage , which can be defined as reconstruction loss:


where distance function is used to describe the similarity of original and reconstructed samples. In practice, the distance function is often defined as norm. In computer vision, distance between two images can also be measured as PSNR or SSIM (Hore and Ziou, 2010).

If we consider attribute extraction attacks, we similarly use feature loss to indicate how likely an attacker is to predict private features successfully. Note that we do not adopt the error between the true value of private attributes and the prediction results of attackers, because adversaries could then break the defense by simply flipping their results. Instead, we calculate the distance between the prediction result and a fixed vector. The feature loss corresponding to the -th private attribute is defined as:


where is a fixed vector generated randomly, whose size is the same as . is a corresponding attribute extraction network trained by adversaries. Low implies meaningless prediction results. In this condition, the encoding system minimizes the overall generalization loss:


where . The value of is assigned depending on users’ privacy requirements.

3.3. The ARS Collaborative Learning Framework

We apply an autoencoder (Ng et al., 2011), an unsupervised neural network, to learn data representations. An autoencoder can be divided into an encoder part and a decoder part. The encoder transforms inputs into latent representations whose size is smaller than that of the inputs, while the decoder maps representations to reconstructions of the inputs. The optimization target of the autoencoder is to minimize the difference between original inputs and reconstructed ones. Since the encoder compresses the feature space, latent representations are expected to retain high-level features of the input data (Erhan et al., 2010).

As shown in Fig. 1, the ARS collaborative learning framework consists of three phases:

  • Encoder publishing phase. In this phase, a common encoder is trained and published to all participants. The user with permission to train is called the initiator, which can be selected randomly or be the party owning the most data. All users share the same encoder to ensure that each column of the representation vectors shared by any user expresses a specific meaning. Suppose that user is chosen; it first trains an autoencoder on its local dataset and then publishes the encoder part (), as:

  • Data sharing phase. After the common encoder is published, user encodes its data into representations as: . Then it shares the pair with the other parties. (In fact, users generate noise for each data representation, then publish instead of to defend against model inversion attacks. See Section 4.)

  • Collaborative learning phase. Finally, each participant can use the pairs to train deep learning models locally. For example, model can be trained by minimizing the empirical risk:


    where is the number of pairs shared by all users, and is the loss function of model .
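The three phases above can be sketched end to end on toy data. This is a minimal illustration under stated assumptions: the common encoder is a fixed linear map standing in for the encoder part of a trained autoencoder, the downstream model is a nearest-centroid classifier rather than a deep network, and all names (`make_encoder`, `share`, `train_centroids`, `classify`) are hypothetical. Noise generation (Section 4) is omitted.

```python
def make_encoder(weights):
    """Return a hypothetical common encoder: a fixed linear map x -> z."""
    def encode(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return encode


def share(encode, dataset):
    """Data sharing phase: a user publishes (representation, label) pairs."""
    return [(encode(x), y) for x, y in dataset]


def train_centroids(pairs):
    """Collaborative learning phase: fit a nearest-centroid classifier."""
    sums, counts = {}, {}
    for z, y in pairs:
        acc = sums.setdefault(y, [0.0] * len(z))
        for k, zk in enumerate(z):
            acc[k] += zk
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def classify(centroids, z):
    """Predict the label whose centroid is closest to z."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, z))
    return min(centroids, key=lambda y: dist(centroids[y]))


# Encoder publishing phase: the initiator publishes the encoder.
encoder = make_encoder([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]])

# Two users with toy local datasets (3 features, labels 0 and 1).
user_a = [([0.0, 0.1, 0.0], 0), ([1.0, 0.9, 1.0], 1)]
user_b = [([0.1, 0.0, 0.1], 0), ([0.9, 1.0, 1.1], 1)]

# Every participant can train on the pooled representations locally.
pooled = share(encoder, user_a) + share(encoder, user_b)
centroids = train_centroids(pooled)
prediction = classify(centroids, encoder([1.0, 1.0, 1.0]))
```

Only representations and labels leave a user's machine in this sketch; the raw samples in `user_a` and `user_b` are never pooled.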


The ARS framework provides data utility in two respects. First, accuracy of the deep learning task (i.e. prediction of label ) is ensured. Equation (1) can be explained as: for any sampled from the input distribution and its ground truth label , we aim to maximize to make it close to (which should be slightly less than ). The goal is equivalent to minimizing the generalization error of :


Empirically, models with lower empirical risk (training error) normally have lower generalization error as well. ARS reduces by decreasing in the collaborative learning phase. If dataset is representative and is well trained, the joint learning framework will not lose much accuracy.

Second, shared data representations are adaptive to various tasks. Since is trained without any supervision or knowledge about collaborative learning tasks, we have indicating that latent representations are task-independent. Meanwhile, data representations are valid for downstream tasks, which can be explained by the basic idea that learning about the input distribution helps with learning about the mapping from inputs to outputs (Goodfellow et al., 2016). A well-trained autoencoder retains high-level features of inputs, which are also useful for supervised learning tasks.

Figure 2. Model inversion attack to reconstruct samples.
Figure 3. Adding adversarial noise on latent representations.

4. Adversarial Noise Mechanism Against Inversion Attacks

This section discusses the privacy of ARS. We first define a threat model caused by model inversion attacks. Then we describe how adversarial noise protects data representations from the inversion attacks. Finally, we extend ARS to the scenarios of vertical data partitioning and attribute extraction attacks.

4.1. Threat Model

The main threat to ARS collaborative learning is model inversion attacks. Although the private information of raw data is hidden in representations, it may still be recovered by reconstruction models, such as the decoder trained by the initiator in the encoder publishing phase. For simplicity, we first focus on reconstruction attacks, and extend our method to overcome attribute extraction attacks in Section 4.4.

The threat model defines adversaries who act like curious participants and try to recover the original input from a data representation to obtain private information. Like other participants, adversaries can obtain data representations shared by users and have query access to the published encoder , but have no knowledge about the architecture and parameters of . Nonetheless, adversaries can still exploit query feedback of to carry out black-box attacks and acquire substitute decoders of . For instance, if user with data is an attacker, it can generate latent representations by querying for times. Then it can build a substitute decoder as Figure 2 shows. can be optimized by minimizing the distance between original and recovered samples:


This kind of attack can be regarded as a chosen-plaintext attack (CPA) from the cryptographic point of view. The reconstruction ability of model inversion attacks is strong when the training samples of victims and attackers have similar distributions. In the following, we propose the adversarial noise mechanism to defend against inversion attacks.

4.2. Adding Adversarial Noise

The strategy to defend against inversion attacks comes from a simple idea: adding an intentionally designed small noise to latent representations before sharing them. On the one hand, the scale of the noise vector is so small that it would not reduce the utility of shared data representations. On the other hand, we hope adding noise to data representations can make data reconstructed by different enough from the original inputs. Inspired by adversarial examples, we let users add adversarial noise to latent representations in set , and share the set of adversarial examples instead, as shown in Figure 3.

According to prior research (Papernot et al., 2017), adversarial examples have strong transferability. Empirically, if successfully fools the decoder from which it is generated, then it is likely to cause another decoder with a similar decision boundary to recover samples that are very different from the original inputs. This property of adversarial noise determines its ability to protect the privacy of shared data representations even if the scale of noise is small.

The method to generate effective adversarial noise consists of two steps. A participant first trains a substitute decoder locally by simulating reconstruction attacks; then it generates adversarial noise for to make invalid. Adversarial noise is generated through the iterative fast gradient sign method (I-FGSM) (Goodfellow et al., 2015), which sets the direction of adversarial noise to the sign of the gradient of the objective function with respect to . The adversarial latent representation is calculated as:


where is a hyper-parameter regulating the scale of noise in each iteration and is the number of iterations. The adversarial noise on can be denoted as:


In consideration of data utility, the difference between and should not be too great, otherwise would lose most of the features of . Therefore, given an encoded vector , we must ensure that , where is a hyper-parameter to be chosen. Next we prove that adversarial noise in Equation (9) satisfies the privacy budget.

Proposition 1.

Given , . Suppose , then the adversarial data representation defined by Equation (9) satisfies: .

Proof. For any iteration step , we have:

Therefore, . ∎

Proposition 1 shows that if we set to in Equation (9), then the scale of adversarial noise will be limited to . Here is called defense intensity, which determines the utility and privacy of representations.
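The bound in Proposition 1 can be checked numerically on a toy case. The sketch below is an illustration under assumptions that are not stated explicitly in the text: a hypothetical linear substitute decoder, a squared-error reconstruction loss, a per-step size equal to the defense intensity divided by the number of iterations, and a noise bound measured in the max norm. All function names and values are illustrative.

```python
def decode(W, z):
    """Hypothetical linear substitute decoder: x_hat = W z."""
    return [sum(w * zk for w, zk in zip(row, z)) for row in W]


def recon_loss(W, z, x):
    """Squared L2 reconstruction error between decode(z) and x."""
    return sum((a - b) ** 2 for a, b in zip(decode(W, z), x))


def grad_z(W, z, x):
    """Gradient of the squared error with respect to z: 2 W^T (W z - x)."""
    residual = [a - b for a, b in zip(decode(W, z), x)]
    return [2 * sum(W[i][k] * residual[i] for i in range(len(W)))
            for k in range(len(z))]


def sign(v):
    return (v > 0) - (v < 0)


def ifgsm(W, z, x, eps, steps):
    """Take `steps` sign-gradient ascent steps of size eps / steps."""
    alpha = eps / steps
    z_adv = list(z)
    for _ in range(steps):
        g = grad_z(W, z_adv, x)
        z_adv = [zk + alpha * sign(gk) for zk, gk in zip(z_adv, g)]
    return z_adv


W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy decoder weights
x = [1.0, 2.0, 3.5]                        # original sample
z = [1.0, 2.0]                             # its clean representation
z_adv = ifgsm(W, z, x, eps=0.1, steps=10)
noise = [a - b for a, b in zip(z_adv, z)]
```

Each step moves every coordinate by at most `eps / steps`, so after `steps` iterations no coordinate of `noise` exceeds `eps` in magnitude, while the reconstruction error of the substitute decoder on `z_adv` is larger than on `z`.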

4.3. Adding Masked Adversarial Noise

 can mislead most decoders trained from pairs due to the transferability of adversarial examples. However, model inversion can still occur if attackers apply adversarial training (Goodfellow et al., 2015) (also called data enhancement) to build . Suppose an attacker (user ) aims to execute a reconstruction attack. In the data sharing phase, the attacker first trains on its local data by optimizing Equation (8). Then it transforms into representations and adds adversarial noise to them as Equation (9) shows. After that, the attacker can train on as:


To defend against model inversion attacks with data enhancement, we propose noise masking, a simple and effective method that makes participants perturb data representations in unique ways. Each user possesses a mask vector (denoted by ) with the same size as the data representations. All dimensions of are randomly initialized to either or , in order to mask some dimensions of the calculated gradients and thus allow users to perturb only the other dimensions of the representations. The process of generating masked adversarial noise on representation is expressed as:


The value of a vector is held secretly by its owner, so with high probability attackers train their substitute models on data representations that are perturbed in a significantly different way from the representations of target users. We discuss whether and when the masks are effective in Section 6.1, and prove that mask vectors with sufficiently large dimension are difficult to crack through brute force. Therefore, ARS with noise masking can be considered safe enough. Algorithm 1 shows the overall process of generating data representations in ARS.

Input: training samples
Output: representations
1 initialize , , , ;
2 , where ;
3 update via: ;
4 for  to  do
5       ;
6       for  to  do
7             ;
9       end for
11 end for
12return ;
Algorithm 1 ARS Data Sharing Method for User
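The masking step can be sketched in isolation. This is a minimal illustration that assumes the mask entries are drawn from {0, 1} and that the mask is multiplied element-wise with the sign of the gradient; the gradient values are stand-ins rather than the output of a real substitute decoder, and `make_mask` and `masked_step` are hypothetical names.

```python
import random


def sign(v):
    return (v > 0) - (v < 0)


def make_mask(dim, rng):
    """Secret per-user mask with entries drawn uniformly from {0, 1}."""
    return [rng.randint(0, 1) for _ in range(dim)]


def masked_step(z, grad, mask, alpha):
    """One sign-gradient step applied only where mask == 1."""
    return [zk + alpha * mk * sign(gk) for zk, gk, mk in zip(z, grad, mask)]


rng = random.Random(0)                 # fixed seed only for reproducibility
mask = make_mask(8, rng)
z = [0.5] * 8                          # toy latent representation
grad = [1.0, -2.0, 0.5, -0.1, 3.0, -1.5, 0.2, -0.7]   # stand-in gradient
z_adv = masked_step(z, grad, mask, alpha=0.01)
```

Under this assumption a representation with d dimensions admits 2^d possible masks, which is why guessing a user's mask by brute force becomes impractical as d grows.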

4.4. Extension

4.4.1. Attribute Extraction Attacks

In addition to reconstruction attacks, user acting as an adversary can also train a classifier on its local pairs , as:


where is the index of the target private attribute. After that, the attacker can conduct attribute extraction attacks by predicting the -th private attribute from shared data representations.

The strategy to overcome feature leakage is similar to the above method. To preserve the -th private attribute, users should first train a classifier locally, and then craft adversarial noise on to minimize the feature loss , which by the definition in Section 3.2.2 is the distance between the prediction and a fixed vector. Here we simply apply the FGSM method as:


where is the adversarial noise to preserve the -th private attribute. The purpose is to keep prediction results close to a certain vector , regardless of inputs, thus making the predictions meaningless.

Consequently, the overall adversarial noise of an arbitrary participant can be calculated as:


where , so that . Experimental results in Section 5.3 show that letting , may be a good choice to ensure the defense against data leakage and feature leakage at the same time.

4.4.2. Vertical Data Partitioning

In the second extension, we generalize ARS to the vertical data partitioning scenario. For simplicity, we suppose that datasets are aligned on IDs, and each participant owns samples. In the training phase, user trains on its local dataset , then calculates and shares . One of the users shares the labels . Unlike in horizontal data partitioning, no common encoder is used, so does not need to be shared.

To train downstream models collaboratively, data representations shared by each user are concatenated as . Afterwards, any user who has sufficient computational power is able to train model on locally.
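The concatenation step can be sketched as follows. The example assumes that each user publishes its representations as a list ordered by the same sample IDs; the function name `concat_by_id` and the toy vectors are illustrative.

```python
def concat_by_id(shared):
    """Concatenate per-user representations that are aligned on sample IDs.

    `shared` is a list with one entry per user; each entry is a list of
    representation vectors, ordered by the same sample IDs.
    """
    n = len(shared[0])
    if any(len(user_reps) != n for user_reps in shared):
        raise ValueError("all users must share the same aligned samples")
    return [[v for user_reps in shared for v in user_reps[i]]
            for i in range(n)]


user_1 = [[0.1, 0.2], [0.3, 0.4]]            # 2 samples, 2-dim representations
user_2 = [[1.0], [2.0]]                      # same samples, 1-dim representations
user_3 = [[5.0, 6.0, 7.0], [8.0, 9.0, 0.0]]  # same samples, 3-dim representations
joint = concat_by_id([user_1, user_2, user_3])
```

Each row of `joint` is the joint representation of one sample, whose dimension is the sum of the per-user representation sizes; users may use encoders with different output sizes because no common encoder is required.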

5. Experiments

In this section, we evaluate ARS by simulating a multi-party collaborative learning scenario. We present the performance of ARS in preserving privacy, compare it with other joint learning frameworks, and then study the effectiveness of our mechanism in protecting private attributes.

5.1. Experiment Settings

5.1.1. Datasets

The experiments are conducted on three datasets: MNIST (LeCun, 1998) and CelebA (Liu et al., 2014) for horizontal data partitioning, and Adult (Dua and Graff, 2017) from the UCI Machine Learning Repository for vertical data partitioning. MNIST consists of 70000 images of handwritten digits, each of size $28 \times 28$. CelebA is a face dataset with more than 200K images, each with 40 binary attributes. Each image is resized to . The Adult dataset contains census information, and the given task is to determine whether a person makes over $50,000 a year based on 13 attributes including “age”, “workclass” and “education”. In our experiments, the inputs are mapped into real vectors with 133 feature columns.

5.1.2. Scenario

We simulate horizontal data partitioning scenarios where the number of participants in MNIST, and in CelebA. Each participant is randomly assigned 10000 examples as a local dataset. When conducting experiments on MNIST, the common encoder and each user’s substitute decoder are implemented by three-layer ReLU-based fully connected neural networks. On CelebA, they are implemented by four-layer convolutional neural networks. In the data sharing phase, the number of iterations is set to 10. We suppose that attackers execute adversarial training attacks (data enhancement) for two types of objectives: to recover original samples (see Section 5.2), or to extract private attributes (see Section 5.3). In the collaborative learning phase, the tasks are set as training classifiers on shared data representations . The labels are 10-dimensional one-hot codes in MNIST, and 2-dimensional vectors corresponding to binary attributes in CelebA.

We also present a vertical data partitioning scenario on the Adult dataset in Section 5.2.4. Each user holds aligned samples, while the number of column vectors owned by different users is kept as close as possible. The encoders and substitute decoders are implemented by four-layer ReLU-based fully connected neural networks. An adversary owning partial inputs would like to attack the user who possesses the same feature columns of samples.

All of the above tasks can be regarded as case studies of collaborative learning in the real world. For example, companies can share privacy-preserving representations of photos to train face recognition models; enterprises may cooperate with each other to draw more comprehensive customer personas.

5.1.3. Privacy Metrics

We consider three metrics to measure , which reflects the privacy leakage caused by reconstruction attacks. Following the common practice of representing as norm, we define as . This metric is also well known as MSE (Mean Square Error). PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index) (Hore and Ziou, 2010) reflect pixel-level differences between original and reconstructed images and are highly consistent with human perception, so they are also applicable for evaluating privacy leakage. These metrics are widely adopted in related research, which allows us to compare published results with ours directly. To measure the privacy of private attributes, we simulate attacks on these attributes and record the proportion of predictions that are equal to the fixed vector.
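As a rough illustration of the first two metrics, the sketch below computes MSE and PSNR for flattened images with pixel values in [0, 1]. The per-pixel averaging and the `max_value` parameter are assumptions made for this example, since the exact normalization used in the experiments is not recoverable from the text; SSIM is omitted because it requires windowed statistics.

```python
import math
from typing import Sequence


def mse(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error between an original and a reconstructed image."""
    if len(x) != len(x_hat) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def psnr(x: Sequence[float], x_hat: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(x, x_hat)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / err)


# Hypothetical 4-pixel image and an imperfect reconstruction of it.
original = [0.0, 0.5, 1.0, 0.25]
reconstructed = [0.1, 0.5, 0.9, 0.25]
print(mse(original, reconstructed), psnr(original, reconstructed))
```

From the defender's point of view, a higher MSE and a lower PSNR both indicate that the reconstruction is further from the original, that is, less leakage.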

Figure 4. Classification accuracy and reconstruction loss versus . Panels: (a) MNIST, accuracy; (c) CelebA, accuracy; (d) CelebA, MSE.
Figure 5. Digit images and corresponding reconstructed images. (a) Original input images. (b) Reconstructed images corresponding to data representations without noise. (c) Reconstructed images corresponding to representations with adversarial noise ().
Figure 6. Face images and corresponding reconstructed images. Images in column (a) are raw data. Column (b), (c), (e), (f) corresponds to adversarial latent representations with different . Column (d) corresponds to adversarial representations () without noise masking.

5.2. Protecting Privacy Against Reconstruction Attacks

5.2.1. Defense Intensity

We first evaluate the utility and privacy of ARS with different defense intensity . Figure 4 reports the accuracy of the classifier trained on shared data representations to evaluate the utility, and the MSE of reconstructed images to evaluate privacy. Experiments are conducted on both the MNIST and CelebA datasets. On CelebA, the task is to predict the attribute "Male". We set up another method as a baseline, which generates random noise with uniform distribution instead of adversarial noise. Observe that with the increase of , the reconstruction loss becomes higher, which indicates that adversarial noise with a larger scale makes it more difficult to extract private information from shared representations. When measuring the utility of shared data, we find that when becomes larger, the accuracy decreases slightly from 97.2% to 89.3% in MNIST, and from 93.7% to 84.3% in CelebA. The variation of MSE and accuracy with the scale of noise illustrates the trade-off between utility and privacy of shared data.

5.2.2. Visualization of Reconstructed Images

We then explore the effectiveness of the adversarial noise defense by displaying the images obtained under reconstruction attacks. Figure 5 compares digit images recovered from adversarial latent representations with the undefended version, and illustrates that can well ensure data privacy. This provides preliminary evidence for the privacy offered by the adversarial noise mechanism.

We further study the influence of different in CelebA. Figure 6 shows the reconstructed images corresponding to data representations with several kinds of noise added. If adversarial noise is not used, the reconstructed image restores almost all private information of faces. When is set to 50, the recovered faces lose most of the features used to determine identity. When , the reconstructed images become completely unrecognizable. For further discussion, we present the result when , while the adversarial noise is not masked. As shown in column (d) of Figure 6, the faces do get blurry, but some features with private information are still retained.

The experiments show satisfactory performance of adversarial noise on latent representations. If defense intensity is set to a sufficiently small value, the shared data can maintain high utility and privacy. In a real application, data utility is expected to be as high as possible while privacy is well preserved. We choose as a suitable defense intensity for both datasets in the following experiments, because of its high privacy and acceptable classification accuracy, which is 92.8% in MNIST and 88.3% in CelebA.

Framework Basic Method Reported Accuracy on MNIST Communication Cost per Server
SecureML MPC 93.4%
DPFL Federated Learning ()
ARS Representation Learning 92.8% ()
Table 1. Comparison of ARS and mainstream methods on important metrics.

5.2.3. Comparison with Existing Mechanisms

We first evaluate data utility by comparing ARS with mainstream joint learning frameworks on MNIST. Table 1 summarizes the performance of these methods in terms of important metrics such as classification accuracy and communication cost. SecureML (Mohassel and Zhang, 2017) is a two-party computation (2PC) collaborative learning protocol based on secret sharing and garbled circuits. Differentially private federated learning (Geyer et al., 2017) (which we call DPFL) approximates the aggregation results with a randomized mechanism to protect datasets against differential attacks. As shown in the table, the classification accuracy of ARS reaches a level similar to that of MPC and FL based methods, when we regard as a compromise between utility and privacy. Note that ARS is designed for general scenarios rather than specific datasets or tasks, and better results can be achieved by using more complex models or fine-tuning hyper-parameters. Moreover, ARS has a low communication cost. We denote by the average number of training samples owned by each party, by the dimension of the original data, by the batch size, by the number of epochs to train models, by the number of parameters in a model, and by the dimension of latent representations. The number of iterations needed to train a deep learning model with stochastic gradient descent (SGD) depends on the number of training samples; otherwise the model will be underfitted. The complexity of should therefore not be lower than , and it is easy to show that ARS has the lowest communication complexity among the three methods.

We next evaluate the privacy of ARS by simulating reconstruction attacks. Since PSNR and SSIM are widely adopted by recent work in this field, we also calculate these two metrics as privacy leakage, and compare ARS with two state-of-the-art data sharing mechanisms: a generative adversarial training based sharing mechanism (Xiao et al., 2020) and an SCA based sharing mechanism (Ferdowsi et al., 2020). Similar to ARS, both of them learn representations of data. The experiment is conducted on CelebA. Table 2 reports the privacy leakage of the mechanisms. As we can see, ARS performs better than the other two frameworks in PSNR, even when . When increases to 100, the SSIM of ARS also reaches the best result of the three mechanisms.

(Xiao et al., 2020) (Ferdowsi et al., 2020)
PSNR 9.932 5.748 15.527 15.445 12.31
SSIM 0.531 0.101 0.728 0.300 0.25
Table 2. Privacy of different representation based mechanisms.

5.2.4. Vertical Data Partitioning

We next evaluate the performance of ARS on the Adult dataset under the vertical data partitioning scenario. In Table 3, we divide the 133 feature columns of samples as evenly as possible among three users, then present the prediction accuracy (Acc), F1-score, and robustness to reconstruction attacks of the collaborative learning system with different numbers of participants. To confirm the effectiveness of the inversion attack, we show the accuracy of adversarial training based reconstruction from data representations without adversarial noise (Adv. Tr.). Here we suppose that encoders used to generate data representations can be obtained by adversaries. To estimate the privacy of ARS, we present the reconstruction accuracy on concatenated adversarial latent representations (Rec. Acc). We observe that a larger number of collaborating users leads to higher prediction accuracy and F1-score of the downstream classification model, and when the noise scales up, the reconstruction accuracy drops to less than . These results demonstrate the utility and privacy of ARS under the vertical data partitioning scenario.
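The "as evenly as possible" division of feature columns can be sketched as follows. The helper `split_columns` is hypothetical, and assigning contiguous column ranges to users is an assumption of this example; the text does not state which columns each user receives.

```python
from typing import List


def split_columns(num_columns: int, num_users: int) -> List[range]:
    """Split column indices into contiguous blocks whose sizes differ by at most one."""
    if num_users <= 0:
        raise ValueError("num_users must be positive")
    base, extra = divmod(num_columns, num_users)
    parts = []
    start = 0
    for k in range(num_users):
        # The first `extra` users each take one additional column.
        size = base + (1 if k < extra else 0)
        parts.append(range(start, start + size))
        start += size
    return parts


# 133 feature columns of the Adult dataset divided among three users.
sizes = [len(part) for part in split_columns(133, 3)]
print(sizes)  # [45, 44, 44]
```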

            K=1                                     K=2                                     K=3
Noise scale Acc       F1-score  Adv. Tr.  Rec. Acc  Acc       F1-score  Adv. Tr.  Rec. Acc  Acc       F1-score  Adv. Tr.  Rec. Acc
0           78.9%     87.2%     97.4%     97.3%     82.2%     88.7%     97.6%     97.3%     84.6%     89.9%     97.7%     97.5%
10          78.9%     87.1%     97.2%     77.5%     81.6%     88.1%     97.4%     82.8%     83.8%     89.3%     97.2%     84.7%
25          78.5%     86.8%     96.7%     45.4%     81.2%     87.7%     93.4%     61.3%     83.6%     89.5%     96.7%     55.9%
50          78.5%     86.9%     92.8%     32.6%     81.5%     88.2%     91.7%     46.2%     83.5%     89.6%     93.6%     36.1%
Table 3. Results with different numbers of data owners on the vertical data partitioning scenario.

5.3. Preserving Privacy of Attributes

In this section, we evaluate ARS under a stronger assumption that users have some private attributes to protect. We assess the effectiveness of the defense against attribute extraction attacks by how close the extracted features are to a fixed vector given by users. The experiments are conducted on CelebA. For all participants, we set predicting the attribute "High Cheekbones" as the collaborative learning task, while selecting "Male" and "Smiling" as private attributes. Then we let each user train an attribute extraction network corresponding to the -th private attribute, which is similar to the adv-training decoder attack. We choose some typical values of to generate adversarial noise, and set the fixed vector since the outputs of classifiers are two-dimensional vectors. Then we record the proportion of predictions of the private attribute that are equal to , in order to estimate the ability of ARS to mislead attribute extraction models.
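The recorded proportion can be computed as in the sketch below. It assumes that each two-dimensional classifier output is converted to a one-hot prediction by taking the largest score, and it uses a hypothetical fixed vector `(0, 1)`; the value actually used in the experiments is not given here. The names `to_one_hot` and `equating_rate` are illustrative.

```python
from typing import Sequence, Tuple


def to_one_hot(scores: Sequence[float]) -> Tuple[int, ...]:
    """Turn a score vector into a one-hot prediction (largest score wins)."""
    top = max(range(len(scores)), key=lambda i: scores[i])
    return tuple(1 if i == top else 0 for i in range(len(scores)))


def equating_rate(outputs: Sequence[Sequence[float]], fixed: Sequence[int]) -> float:
    """Fraction of attack-model predictions that coincide with the fixed vector."""
    if not outputs:
        raise ValueError("outputs must not be empty")
    target = tuple(fixed)
    hits = sum(1 for scores in outputs if to_one_hot(scores) == target)
    return hits / len(outputs)


# Hypothetical outputs of an attribute extraction network on four samples.
outputs = [(0.2, 0.8), (0.6, 0.4), (0.1, 0.9), (0.3, 0.7)]
rate = equating_rate(outputs, (0, 1))
print(rate)  # 0.75
```

A rate close to 1 would mean that the attack model is almost always steered to the user-chosen vector, regardless of the true attribute value.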

Figure 7. Proportion that the predictions of private attributes are equal to a given fixed vector.

We analyze the effect of different compositions of adversarial noise by changing as illustrated in Figure 7. Panels (a-d) show that with the increase of , the probability that predictions of the -th attribute are equal to becomes higher. Note that the equating rate sometimes gets lower when increases to ; this may be caused by the influence of the other components of adversarial noise. Panels (e-f) demonstrate that the weight of a noise has a great effect on the privacy of the -th attribute. Our further experiment shows that the accuracy of attack classifiers can be close to when .

We further show that the accuracy of classifiers for preserved attributes is close to . Figure 8 presents the accuracy of classifiers for three attributes, two of which are private. With the increase of , the accuracy corresponding to the -th attribute approaches if . In addition, when is set to , the classification accuracy of the second private attribute "Smiling" has a significant drop even if , which again shows the trade-off between utility and privacy of shared data. In summary, ARS is shown to be effective against attribute extraction attacks with acceptable privacy budget .

Figure 8. Classification accuracy on three attributes, with variable value of and . "Male" and "Smiling" are private attributes.
22.617 15.274 8.812 7.456 6.720
22.492 18.363 12.033 10.261 9.522
22.497 19.408 14.473 11.976 10.671
22.482 20.827 19.109 15.004 13.179
Table 4. PSNR of different composition of noise and .

We next evaluate the reconstruction error under the same scenario. As we can see in Table 4, larger and lead to greater defense against reconstruction attacks. If we consider an acceptable privacy leakage since it is smaller than the results of similar representation sharing works (Xiao et al., 2020) and (Ferdowsi et al., 2020) we compared in Section 5.2.3, then is a good choice to defend against reconstruction and attribute extraction attacks at the same time.

5.4. Task-Independence Study

Another advantage of the ARS mechanism is that the encoder publishing phase is independent of the collaborative learning phase. Most up-to-date joint learning frameworks are task-oriented. For example, in the FederatedAveraging algorithm, a server builds a global model according to a specific task, and communicates the parameters with clients. For prior data sharing frameworks, the networks that extract latent representations of data are usually trained together with the task-oriented models (such as classifiers). The accuracy of models trained on these representations is regarded as a part of the optimization objective of the feature extracting networks. Task-dependence causes low data utilization. If parties in an existing collaborative learning framework have a new deep learning task, they have to build another framework and train feature extracting networks once again, which results in a heavy cost to transform data and train deep learning models.

In the ARS mechanism, apart from private attributes that are protected, the representation generating process is independent of deep learning tasks. We set up a series of experiments on CelebA to demonstrate this property. We select five of the forty attributes in CelebA and set a binary classification task for each attribute; none of them are private to users. In the collaborative learning phase, participants train five classification networks on the latent representations, each classifier corresponding to one attribute. Since the MSE loss is only related to the encoder, but not to deep learning tasks, we only focus on classification accuracy. Table 5 shows the results with various scales of noise.

Heavy Makeup 86.4% 85.7% 85.0% 87.5% 85.9%
High Cheekbones 80.6% 84.2% 80.1% 82.8% 82.1%
Male 91.7% 91.7% 90.0% 89.3% 88.3%
Smiling 88.0% 84.7% 87.9% 87.5% 84.3%
Wearing Lipstick 90.2% 80.6% 86.7% 85.0% 82.8%
Table 5. Classification accuracy in different tasks. For each of the five attributes from CelebA dataset, we train classifiers to predict whether the label is positive or negative from generated by the same public encoder.

As shown in the results, classification accuracy is higher than 73%, which indicates acceptable correctness. Only the accuracy of predicting the attribute "Mouth Slightly Open" is lower than 80% with the increase of . The results demonstrate that ARS is task-independent. If there are demands for data sharing, the initiator can just train the public autoencoder, without considering how data representations would be used. Participants only need to determine their private attributes and generate adversarial noise to preserve these attributes. This property prevents the utility of latent representations from being limited to specific tasks, and ensures the robustness of shared data. Representations generated by users independently can achieve good performance in various tasks if the value of hyper-parameter is chosen well. Consequently, ARS can preserve privacy during the data sharing process, while maintaining the utility of data in collaborative learning.

6. Discussion

6.1. Discussions on Noise Masking

We focus on the security of the noise masking mechanism by studying whether it can defend against brute-force searching attacks. An attacker can randomly enumerate several mask vectors, train inverse models on representations with each of these mask vectors, and take the vector that performs best in reconstructing others' data as a good approximation of the victim's mask. We explore experimentally the relationship between the reconstruction loss and the overlapping rate of masks held by attackers and defenders, which equals the Hamming distance of the mask vectors divided by their dimension. The experiment is conducted on MNIST, with settings stated in Section 5. As Table 6 illustrates, a higher overlapping rate leads to a higher risk of privacy leakage. We therefore study the overlapping of masks next.

Overlapping Rate 0% 25% 50% 75% 100%
0.119 0.101 0.082 0.048 0.021
0.179 0.156 0.128 0.09 0.025
Table 6. Reconstruction loss with various overlapping rate of masks held by attackers and defenders.

For any -dimensional mask vectors and , we denote the Hamming distance between them as , and define the overlapping rate between and as . Then we have


which means that .

Suppose is a real number such that , then from the De Moivre-Laplace theorem, the probability that and have bits different is:


Therefore, if the dimension is large enough, the probability that the overlapping rate of two random -dimensional vectors is larger than approaches for . Moreover, we consider as an acceptable privacy budget for preserving information of data. That is to say, an attack is considered successful if the overlapping rate of the masks held by the attacker and the user is greater than a real number , where . When the dimension of latent representations is large enough, the privacy of users' data can be guaranteed. For example, the dimension of latent representations is . If we accept 75% as the overlapping rate, then we have , which means that the privacy of data can be considered well preserved by the mask mechanism.
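The argument above can be checked numerically with an exact binomial tail instead of the normal approximation. The sketch assumes that both masks consist of independent, uniformly random bits, so the number of positions that match (or, by symmetry, that differ) follows a Binomial(d, 1/2) distribution. The function name `prob_overlap_at_least` and the dimensions 64 and 256 are hypothetical choices for illustration, not values taken from the experiments.

```python
from math import ceil, comb


def prob_overlap_at_least(dim: int, rate: float) -> float:
    """Probability that two uniformly random binary masks of length `dim`
    coincide in at least `rate * dim` positions (exact binomial tail)."""
    threshold = ceil(rate * dim)
    favorable = sum(comb(dim, k) for k in range(threshold, dim + 1))
    # Integer arithmetic keeps the count exact; only the final ratio is a float.
    return favorable / 2 ** dim


# Chance that a single random guess reaches a 75% overlapping rate.
for dim in (8, 64, 256):
    print(dim, prob_overlap_at_least(dim, 0.75))
```

The probability shrinks rapidly as the dimension grows, which matches the claim that a randomly guessed mask is unlikely to reach a high overlapping rate when the latent dimension is large.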

6.2. Future Work

Although ARS shows good performance in our given scenarios, this mechanism still has some limitations. In the horizontal data partitioning scenario, all users generate data representations through the same feature extraction network, which is called the common encoder. This requires the selected initiator to have a sufficient amount of representative data. In practice, however, participants in joint learning may lack enough training samples, or the data of each user may not be independent and identically distributed (non-IID) (Kairouz et al., 2021). This causes the common encoder to overfit, so that it is no longer valid for all users. To cope with this problem, we tried to let the parties train their own encoders on the local datasets, while ensuring that the latent representations have the same distribution. For example, each user applies a variational autoencoder (VAE) (Kingma and Welling, 2013) to constrain data representations to the standard normal distribution. Nevertheless, it is difficult to guarantee that the same dimension of representations generated by different encoders expresses the same semantics. Therefore, aggregation of the shared data representations will no longer make sense.

In this study, we relax the hypothesis on the data owners, requiring that at least one party has sufficient training samples and that the samples of each participant have an identical distribution. This assumption is in accordance with realistic B2C (business to customer) settings, where the initiator can be an enterprise with a certain accumulation of data. It can initiate data sharing with individual users and provide pre-trained feature extraction models to them. Future work will be dedicated to collaborative learning on non-IID data, and we believe domain adaptation of data representations is a viable solution to this problem.

We also introduce task-independence of the shared data, and how this property can help to reduce communication cost. In this work, data representations are extracted by unsupervised autoencoders to avoid task-orientation, yet some prior knowledge such as the available labels of data is underutilized. A possible area of future work is multi-task learning (Zhang and Yang, 2021), where the given tasks or training labels can be made full use of and contribute to each client’s different local problems. Related studies may bring higher utility of data to task-independent collaborative learning.

7. Conclusion

In this work, we propose ARS, a privacy-preserving collaborative learning framework. Users share representations of data to train downstream models. Adversarial noise is used to protect shared data from model inversion attacks. We evaluate our mechanism and demonstrate that adding masked adversarial noise on latent representations has a great effect in defending against reconstruction and attribute extraction attacks, while maintaining almost the same utility as MPC and FL based training. Compared with some prior data sharing mechanisms, ARS outperforms them in privacy preservation. Besides, ARS is task-independent, and requires no centralized control. Our work can be applied to collaborative learning scenarios, and provides a new idea on the research of data sharing and joint learning frameworks.


  • Agrawal et al. (2019) Nitin Agrawal, Ali Shahin Shamsabadi, Matt J Kusner, and Adrià Gascón. 2019. QUOTIENT: two-party secure neural network training and prediction. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 1231–1247.
  • Bengio et al. (2013) Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence 35, 8 (2013), 1798–1828.
  • Church (2017) Kenneth Ward Church. 2017. Word2Vec. Natural Language Engineering 23, 1 (2017), 155–162.
  • Dua and Graff (2017) Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository.
  • Erhan et al. (2010) Dumitru Erhan, Aaron Courville, Yoshua Bengio, and Pascal Vincent. 2010. Why does unsupervised pre-training help deep learning?. In Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 201–208.
  • Ferdowsi et al. (2020) Sohrab Ferdowsi, Behrooz Razeghi, Taras Holotyak, Flavio P. Calmon, and Slava Voloshynovskiy. 2020. Privacy-Preserving Image Sharing via Sparsifying Layers on Convolutional Groups. ICASSP (2020).
  • Gentry (2009) Craig Gentry. 2009. Fully homomorphic encryption using ideal lattices. In Proceedings of the forty-first annual ACM symposium on Theory of computing. 169–178.
  • Geyer et al. (2017) Robin C. Geyer, Tassilo J. Klein, and Moin Nabi. 2017. Differentially Private Federated Learning: A Client Level Perspective. arXiv preprint arXiv:1712.07557 (2017).
  • Goodfellow et al. (2016) Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep learning. MIT press.
  • Goodfellow et al. (2015) Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. ICLR (2015).
  • Hard et al. (2018) Andrew Hard, Kanishka Rao, Rajiv Mathews, Swaroop Ramaswamy, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, and Daniel Ramage. 2018. Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018).
  • He et al. (2019) Zecheng He, Tianwei Zhang, and Ruby B Lee. 2019. Model inversion attacks against collaborative inference. In Proceedings of the 35th Annual Computer Security Applications Conference. 148–162.
  • Hitaj et al. (2017) Briland Hitaj, Giuseppe Ateniese, and Fernando Perez-Cruz. 2017. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 603–618.
  • Hore and Ziou (2010) Alain Hore and Djemel Ziou. 2010. Image quality metrics: PSNR vs. SSIM. In 2010 20th international conference on pattern recognition. IEEE, 2366–2369.
  • Kairouz et al. (2021) Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, et al. 2021. Advances and open problems in federated learning. Foundations and Trends® in Machine Learning 14, 1–2 (2021), 1–210.
  • Kingma and Welling (2013) Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013).
  • LeCun (1998) Yann LeCun. 1998. The mnist database of handwritten digits.
  • Liu et al. (2017) Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. 2017. Delving into Transferable Adversarial Examples and Black-box Attacks. In ICLR 2017 : International Conference on Learning Representations 2017.
  • Liu et al. (2014) Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. 2014. Deep Learning Face Attributes in the Wild. 2015 IEEE International Conference on Computer Vision (ICCV) (2014), 3730–3738.
  • Mahendran and Vedaldi (2015) Aravindh Mahendran and Andrea Vedaldi. 2015. Understanding deep image representations by inverting them. In Proceedings of the IEEE conference on computer vision and pattern recognition. 5188–5196.
  • Mohassel and Zhang (2017) Payman Mohassel and Yupeng Zhang. 2017. Secureml: A system for scalable privacy-preserving machine learning. In 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 19–38.
  • Ng et al. (2011) Andrew Ng et al. 2011. Sparse autoencoder. CS294A Lecture notes 72, 2011 (2011), 1–19.
  • Ohrimenko et al. (2016) Olga Ohrimenko, Felix Schuster, Cédric Fournet, Aastha Mehta, Sebastian Nowozin, Kapil Vaswani, and Manuel Costa. 2016. Oblivious multi-party machine learning on trusted processors. In 25th USENIX Security Symposium (USENIX Security 16). 619–636.
  • Paillier (1999) Pascal Paillier. 1999. Public-key cryptosystems based on composite degree residuosity classes. In International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 223–238.
  • Papernot et al. (2017) Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security. ACM, 506–519.
  • Salimans et al. (2016) Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training gans. In Advances in neural information processing systems. 2234–2242.
  • Samangouei et al. (2018) Pouya Samangouei, Maya Kabkab, and Rama Chellappa. 2018. Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models. In ICLR 2018 : International Conference on Learning Representations 2018.
  • Sharif et al. (2016) Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K. Reiter. 2016. Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. 1528–1540.
  • Shokri and Shmatikov (2015) Reza Shokri and Vitaly Shmatikov. 2015. Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security. ACM, 1310–1321.
  • Xiao et al. (2020) Taihong Xiao, Yi-Hsuan Tsai, Kihyuk Sohn, Manmohan Chandraker, and Ming-Hsuan Yang. 2020. Adversarial Learning of Privacy-Preserving and Task-Oriented Representations. AAAI (2020).
  • Yang et al. (2019) Qiang Yang, Yang Liu, Tianjian Chen, and Yongxin Tong. 2019. Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 2 (2019), 1–19.
  • Yao (1986) Andrew Chi-Chih Yao. 1986. How to generate and exchange secrets. In 27th Annual Symposium on Foundations of Computer Science (sfcs 1986). IEEE, 162–167.
  • Zhang and Yang (2021) Yu Zhang and Qiang Yang. 2021. A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering (2021).