
A New Dimensionality Reduction Method Based on Hensel's Compression for Privacy Protection in Federated Learning

Differential privacy (DP) is considered a de facto standard for protecting users' privacy in data analysis, machine learning, and deep learning. Existing DP-based privacy-preserving training approaches consist of adding noise to the clients' gradients before sharing them with the server. However, implementing DP on the gradients is not efficient, as the privacy leakage increases with the number of synchronization training epochs due to the composition theorem. Recently, researchers were able to recover images used in the training dataset using a Generative Regression Neural Network (GRNN), even when the gradient was protected by DP. In this paper, we propose a two-layer privacy protection approach to overcome the limitations of the existing DP-based approaches. The first layer reduces the dimension of the training dataset based on Hensel's Lemma. We are the first to use Hensel's Lemma to reduce the dimension of (i.e., compress) a dataset. The new dimensionality reduction method allows reducing the dimension of a dataset without losing information, since Hensel's Lemma guarantees uniqueness. The second layer applies DP to the compressed dataset generated by the first layer. The proposed approach overcomes the problem of privacy leakage due to composition by applying DP only once before the training; clients train their local models on the privacy-preserving dataset generated by the second layer. Experimental results show that the proposed approach ensures strong privacy protection while achieving good accuracy. The new dimensionality reduction method achieves an accuracy of 97% while training on only a fraction of the original data size.


1 Introduction

Federated learning (FL) has received significant interest for its advantages compared to traditional (i.e., centralized) machine learning. In addition to mitigating the computational load on the central server, FL allows training a model on large-scale datasets while protecting the users' privacy. FL is a machine learning technique in which multiple clients (e.g., devices or organizations) collaboratively train a model under the supervision of a central server. The clients train the learning model on their local datasets and send the updated gradients to the central server. The server calculates the mean of the received gradients and sends the new value of the global gradient to the clients for the next training epoch. This process is repeated until the model is trained.
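To make this workflow concrete, below is a minimal sketch of one FL synchronization round, assuming gradients are exchanged as NumPy arrays; the function names (client_gradient, federated_round) and the dummy local update are illustrative, not taken from the paper.

```python
# A minimal sketch of one FL synchronization round (illustrative names only).
import numpy as np

def client_gradient(global_weights, local_data, local_labels):
    """Placeholder for a client's local training step returning a gradient."""
    # In practice this would run one or more local epochs of SGD on local data.
    return np.random.randn(*global_weights.shape) * 0.01  # dummy gradient

def federated_round(global_weights, clients, lr=0.1):
    """Server side: collect client gradients, average them, update the model."""
    grads = [client_gradient(global_weights, data, labels)
             for data, labels in clients]
    mean_grad = np.mean(grads, axis=0)          # server aggregates by averaging
    return global_weights - lr * mean_grad      # broadcast back for next epoch

weights = np.zeros((784, 10))
clients = [(None, None)] * 5                    # five dummy clients
for _ in range(3):                              # repeated until convergence
    weights = federated_round(weights, clients)
```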

Although FL ensures a certain level of privacy by not explicitly sharing the data with the server, an attacker (or the server) could retrieve a client's training dataset using only the shared gradient Zhao et al. (2019). This privacy protection problem has been addressed using differential privacy (DP) Dwork et al. (2006); Dwork (2008); Ouadrhiri and Abdelhadi (2022). Before sending the gradients to the server, clients protect their gradients by adding noise drawn from a probability distribution Gong et al. (2020); Yin et al. (2021); Wu et al. (2020). However, applying DP at each synchronization epoch degrades the privacy protection due to the composition theorem Dwork and Roth (2014). For example, if a client applies DP at each synchronization round with a privacy leakage $\epsilon$, then after $T$ epochs the privacy leakage becomes $T\epsilon$. Thus, a malicious server or an attacker could learn a tighter estimate of the clients' gradients.
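As a rough illustration of this composition effect, the sketch below adds Gaussian noise to a gradient under a per-round budget and shows how the basic composition bound grows linearly with the number of epochs; the calibration formula used is the classical $(\epsilon, \delta)$ Gaussian mechanism and is an assumption of this sketch, not a construction from the paper.

```python
# Hedged sketch: per-round DP noise and basic composition of the privacy budget.
import numpy as np

def gaussian_noise_sigma(sensitivity, eps, delta=1e-5):
    # Classical Gaussian-mechanism calibration for (eps, delta)-DP (assumption).
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps

def noisy_gradient(grad, sensitivity=1.0, eps_round=1.0):
    sigma = gaussian_noise_sigma(sensitivity, eps_round)
    return grad + np.random.normal(0.0, sigma, size=grad.shape)

g = noisy_gradient(np.zeros(10), eps_round=0.1)   # one protected round
T = 50                                            # synchronization epochs
eps_round = 0.1
total_eps = T * eps_round                         # basic composition: grows with T
print(f"per-round eps = {eps_round}, after {T} epochs eps = {total_eps}")
```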

To control the privacy leakage, the authors in Wei et al. (2021); Kim et al. (2021); Asoodeh et al. (2021) propose approaches for determining the standard deviation of the Gaussian distribution so that a predefined privacy leakage $\epsilon$ is not exceeded after $T$ synchronization epochs. Nevertheless, these approaches do not enhance privacy protection, because the determined standard deviation depends on the number of synchronization epochs $T$, and the privacy leakage still increases as $T$ increases. Another category of works proposes to handle the problem of privacy leakage by training the FL model via peer-to-peer communications Cyffers and Bellet (2021); Tran et al. (2021); Li et al. (2021). In these works, the server sends the initialized gradient to a client chosen randomly from all clients. This client updates the received global gradient and sends it to another client, and so on, until the last client sends the updated global gradient back to the server. These works ensure strong protection of users' privacy; however, they are vulnerable to label-flipping and data-poisoning attacks Fung et al. (2020); Fang et al. (2020).
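For contrast, here is a minimal sketch of this chained, peer-to-peer style of training, in which the global gradient hops from client to client and only the last client reports back to the server; the names and the dummy local update are illustrative assumptions, not the protocols of the cited works.

```python
# Hedged sketch of chained (peer-to-peer) gradient updating.
import random
import numpy as np

def local_update(gradient, client_id):
    """Placeholder for a client's in-place contribution to the shared gradient."""
    return gradient + np.random.randn(*gradient.shape) * 0.01

def chained_round(global_gradient, num_clients):
    order = random.sample(range(num_clients), num_clients)  # random chain order
    g = global_gradient
    for cid in order:              # gradient hops from client to client
        g = local_update(g, cid)
    return g                       # only the last client talks to the server

g = chained_round(np.zeros(10), num_clients=4)
```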

Moreover, Ren et al. (2021) recently succeeded in recovering the training dataset even when the gradient was protected using DP. The authors generate a fake image with its corresponding label using a generative regression neural network (GRNN) and feed this image to the training model at the server to calculate a fake gradient. Retrieving the original training images is then done by training the GRNN model to minimize the distance between the fake gradient and the true gradient. The authors rely on two main components to complete the attack: the resolution of the target image and the length of the true gradient vector.
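To illustrate the idea behind such gradient-based reconstruction attacks, the sketch below optimizes a fake input so that its gradient matches an observed one on a toy linear model. This is a hedged simplification of the distance-minimization step only; it is not the GRNN architecture of Ren et al. (2021), and the fixed label and tiny model are assumptions of the sketch.

```python
# Toy gradient-matching reconstruction loop (simplified, not GRNN).
import torch

torch.manual_seed(0)
model = torch.nn.Linear(16, 4)        # stand-in for the shared FL model
loss_fn = torch.nn.CrossEntropyLoss()

# Gradient the attacker observed (computed here from a "secret" input).
x_true = torch.randn(1, 16)
y_true = torch.tensor([2])
true_grads = torch.autograd.grad(loss_fn(model(x_true), y_true),
                                 model.parameters())

# The attacker optimizes a fake input so its gradient matches the observed one.
x_fake = torch.randn(1, 16, requires_grad=True)
opt = torch.optim.Adam([x_fake], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    fake_grads = torch.autograd.grad(loss_fn(model(x_fake), y_true),
                                     model.parameters(), create_graph=True)
    dist = sum(((fg - tg) ** 2).sum() for fg, tg in zip(fake_grads, true_grads))
    dist.backward()                   # gradient of the gradient-distance
    opt.step()                        # x_fake drifts toward the secret input
```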

To overcome the aforementioned challenges, we propose a novel privacy-preserving approach that guarantees strong protection of users’ privacy in FL. Specifically, this method includes two layers for privacy protection:

  • The first layer reduces the dimension of the client's training dataset using Hensel's compression. We are the first to use Hensel's Lemma (McDonald (1974), p. 340) for dimensionality reduction.

  • The second layer implements DP by adding noise to the compressed dataset generated by the first layer. These two layers generate a privacy-preserving dataset used by the client in the local training.

Therefore, the proposed approach hides the two main components (i.e., the resolution of the target image and the length of the gradient vector) on which attackers rely to recover the training dataset. Attackers or a malicious server will not have any visibility into a client's original private dataset, as the training is performed on the compressed, noisy dataset. Furthermore, the proposed approach prevents the privacy leakage from growing as the number of synchronization training epochs increases, because DP is applied once to the original dataset before the training starts. Thus, this approach solves the problem of privacy leakage due to composition. In summary, the main contributions of this paper are as follows.

  • We propose an image-based data protection approach for protecting the privacy of users in FL. The proposed approach overcomes the shortcomings of the existing DP-based approaches.

  • We develop a new dimensionality reduction method based on Hensel's Lemma. Unlike state-of-the-art methods, we efficiently reduce the dimension of a dataset without losing information. In addition, the proposed dimensionality reduction method reduces the computational time and the communication overhead by reducing the size of the training dataset.

  • Experimental results demonstrate that our approach guarantees strong protection of users’ privacy while achieving good accuracy.

2 Proposed method

Figure 1 illustrates the main steps for training an FL model using the proposed approach. Before starting the training, the server sends the learning model architecture, the initial global gradient, and the dimension of the dataset elements to all clients, as shown in pre-training step 1. In pre-training step 2, each client reduces the dimension of its local dataset elements (i.e., layer 1) and implements DP (i.e., layer 2) on the compressed dataset generated by the first layer. Pre-training steps 1 and 2 are performed once, before the training starts, to generate the privacy-preserving dataset that is used later in the training. After the pre-training steps, each client starts the local training and sends its local gradient to the server, as illustrated in training steps 1 and 2, respectively. In training step 3, the server aggregates the clients' local gradients to update the global gradient. In training step 4, the server sends the updated global gradient back to the clients. Training steps 1 through 4 are repeated until the learning model is trained. A minimal end-to-end sketch of this workflow is given after Figure 1.

Figure 1: Training an FL model using the proposed privacy-preserving approach.
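The code below is a hedged, end-to-end sketch of this workflow: it separates the one-time pre-training layers from the repeated training rounds. The two layers are stand-ins here (concrete sketches of Hensel's compression and the DP layer are given in Sections 2.1 and 2.2), and all names and shapes are illustrative assumptions.

```python
# High-level sketch of the proposed pipeline (placeholders for the two layers).
import numpy as np

def layer1_compress(img):                 # placeholder for Hensel's compression
    return img[::2, ::2]                  # stand-in: any dimension-reducing map

def layer2_add_noise(img, sigma=0.1):     # placeholder for the DP layer
    return img + np.random.normal(0.0, sigma, img.shape)

def pre_training(local_images):
    # Pre-training steps: run ONCE per client, before any synchronization epoch.
    return [layer2_add_noise(layer1_compress(x)) for x in local_images]

def training_round(global_grad, private_datasets, lr=0.1):
    # Training steps: local training on privacy-preserving data, then averaging.
    local_grads = [np.random.randn(*global_grad.shape) * 0.01  # dummy local grad
                   for _ in private_datasets]
    return global_grad - lr * np.mean(local_grads, axis=0)

clients = [[np.random.rand(28, 28) for _ in range(8)] for _ in range(3)]
private = [pre_training(imgs) for imgs in clients]   # done once per client
g = np.zeros((196, 10))
for _ in range(2):                                   # repeated until trained
    g = training_round(g, private)
```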

2.1 First layer: Dimensionality reduction using Hensel’s compression

The first layer reduces the dimension of the original dataset using Hensel's compression. Unlike the dimensionality reduction methods proposed in the literature Marill and Green (1963); Whitney (1971); Narendra and Fukunaga (1977); Somol et al. (2004); Chen (2003); Almuallim and Dietterich (1991); Kira and Rendell (1992); Liu and Setiono (1996); Kachouri et al. (2010); Peng et al. (2005), the proposed method reduces the dimension of a dataset without losing information. The novelty of this paper builds on the following statement of Hensel's Lemma (McDonald (1974), p. 340).

Lemma 1

Let $x \in \mathbb{Z}_p$ (the ring of $p$-adic integers). There is a unique sequence $(a_i)_{i \ge 0}$, with $a_i \in \{0, 1, \dots, p-1\}$, such that the series $\sum_{i=0}^{\infty} a_i\, p^i$ tends toward $x$. This series is called the Hensel decomposition of $x$.

In our approach, we call this operation Hensel's compression because we go in the opposite direction: instead of decomposing a number, we combine several numbers into one number. In what follows, we explain our method with a use-case example.
Consider a dataset of images, where each element (i.e., image) of the dataset is a matrix $M$ of dimension $n_1 \times n_2$. The approach consists of reducing the dimension of $M$ by dividing it into sub-blocks of dimension $k_1 \times k_2$, such that $k_1$ divides $n_1$ and $k_2$ divides $n_2$. Thus, we get a new matrix $\hat{M}$ of dimension $\frac{n_1}{k_1} \times \frac{n_2}{k_2}$. Figure 2 illustrates an example of reducing a matrix $M$ to a matrix $\hat{M}$ of dimension $\frac{n_1}{k_1} \times \frac{n_2}{k_2}$: the first sub-figure 2-a presents the original matrix; in the second sub-figure 2-b, we divide the matrix into sub-blocks of dimension $k_1 \times k_2$; and the last sub-figure 2-c shows the new matrix generated after applying Hensel's compression. In this example, the base $p$ is chosen larger than the maximum entry of $M$, so that every entry of a sub-block is a valid Hensel digit. Applying Hensel's compression with the chosen $k_1$ and $k_2$ yields a new matrix $\hat{M}$ whose first entry is calculated as follows:

$$\hat{m}_{1,1} \;=\; \sum_{i=1}^{k_1} \sum_{j=1}^{k_2} b_{i,j}\; p^{\,(i-1)k_2 + (j-1)} \qquad (1)$$

where $\hat{m}_{1,1}$ represents the element at the first row and first column of the matrix $\hat{M}$, and $b_{i,j}$ are the entries of the sub-block located at the first row and first column in sub-figure 2-b. In the same way, we calculate the other elements of the matrix $\hat{M}$ from the remaining sub-blocks of the matrix $M$. A minimal code sketch of this compression (and of its inverse) is given after Figure 2.

Figure 2: Example of reducing the dimension of a matrix using Hensel’s compression.
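The following is a hedged sketch of Hensel's compression as described above: each $k_1 \times k_2$ sub-block of integer pixels is combined into a single number via the base-$p$ expansion of Eq. (1), with $p$ chosen larger than the maximum pixel value so that the decomposition is unique and the step is lossless. The row-major ordering of digits within a block and the choice $p = 256$ are assumptions of this sketch.

```python
# Hedged sketch of Hensel's compression and its inverse (lossless by uniqueness).
import numpy as np

def hensel_compress(img, k1=2, k2=2, p=256):
    n1, n2 = img.shape
    assert n1 % k1 == 0 and n2 % k2 == 0 and img.max() < p
    out = np.zeros((n1 // k1, n2 // k2), dtype=object)  # values may exceed 64 bits
    for r in range(0, n1, k1):
        for c in range(0, n2, k2):
            block = img[r:r + k1, c:c + k2].astype(int).ravel()
            out[r // k1, c // k2] = sum(int(b) * p**i for i, b in enumerate(block))
    return out

def hensel_decompress(comp, k1=2, k2=2, p=256):
    # Uniqueness of the base-p digits lets us recover every pixel exactly.
    n1, n2 = comp.shape[0] * k1, comp.shape[1] * k2
    img = np.zeros((n1, n2), dtype=int)
    for r in range(comp.shape[0]):
        for c in range(comp.shape[1]):
            v = int(comp[r, c])
            for i in range(k1 * k2):
                img[r * k1 + i // k2, c * k2 + i % k2] = v % p
                v //= p
    return img

x = np.random.randint(0, 256, size=(28, 28))
assert np.array_equal(hensel_decompress(hensel_compress(x)), x)  # lossless
```

The final assertion checks losslessness: decompressing the compressed matrix recovers every pixel exactly, which is the uniqueness property provided by Hensel's Lemma.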

2.2 Second layer: Privacy-preserving dataset

The second layer applies DP to the compressed dataset produced by the first layer to generate a privacy-preserving dataset. To be specific, we add noise drawn from a Gaussian distribution, which has been proved to satisfy the DP definition Dong et al. (2019); here $\epsilon$ is the privacy leakage, also known as the privacy budget, and $\Delta f$ is the sensitivity of the function on which we apply the DP mechanism. The privacy-preserving dataset is generated by adding noise to each image as follows. Assume a dataset of $N$ compressed images. Each point $\hat{m}_{i,j}$ of a compressed image $\hat{M}$ is perturbed using the following equation:

$$\tilde{m}_{i,j} \;=\; \hat{m}_{i,j} + \eta_{i,j} \qquad (2)$$

where $1 \le i \le \frac{n_1}{k_1}$, $1 \le j \le \frac{n_2}{k_2}$, and $\eta_{i,j}$ is a noise term drawn from the Gaussian distribution $\mathcal{N}(0, \sigma^2)$. The sensitivity $\Delta f$ is the difference between the maximum and the minimum value of the data. In our case, $\Delta f = 1$, as we apply DP after normalizing the dataset. It is important to note that decreasing the privacy leakage $\epsilon$ increases the privacy protection; $\epsilon = 0$ is equivalent to perfect privacy protection. A minimal sketch of this layer is given below.
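The sketch below assumes the compressed images have been normalized to $[0, 1]$ (so the sensitivity is 1) and calibrates the Gaussian standard deviation from the privacy leakage $\epsilon$ via the classical $(\epsilon, \delta)$ Gaussian mechanism; this calibration rule and the example shapes are assumptions of the sketch, not the paper's exact formula.

```python
# Hedged sketch of the second layer: Gaussian DP noise on the compressed data.
import numpy as np

def add_dp_noise(compressed_img, eps, delta=1e-5, sensitivity=1.0):
    # (eps, delta) Gaussian-mechanism calibration (assumption of this sketch).
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = np.random.normal(0.0, sigma, size=compressed_img.shape)  # Eq. (2)
    return compressed_img + noise

rng = np.random.default_rng(0)
compressed = rng.random((14, 14))            # an example normalized compressed image
private = add_dp_noise(compressed, eps=1.0)  # done once, before any training epoch
```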

3 Experiments

The objective of this section is to evaluate the impact of DP and Hensel's compression on the accuracy and the privacy protection. We developed a learning model, shown in Figure 3, composed of two convolutional layers, each followed by a ReLU activation function. The second convolutional layer is followed by dropout regularization to prevent overfitting. Then, we add three fully connected linear layers; the output of the last linear layer has dimension 10, which corresponds to the number of classes in our training dataset. A PyTorch sketch of this architecture is given after Figure 3.

Figure 3: The learning model architecture.
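Below is a hedged PyTorch sketch of the architecture described above (two convolutional layers with ReLU, dropout after the second, and three fully connected layers ending in 10 outputs). The kernel sizes, channel counts, hidden widths, and input side length are assumptions of this sketch; the exact values are fixed in Figure 3.

```python
# Hedged sketch of the learning model (hyperparameters are assumptions).
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, in_side=14, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Dropout2d(p=0.5),                      # dropout after second conv
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * in_side * in_side, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, num_classes),               # 10 outputs = number of classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```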

We trained the model described above using different amounts of privacy leakage $\epsilon$ and different levels of data compression. Figure 4 shows samples of the different versions of the MNIST dataset used in training. Based on the dataset dimension, we divided these experiments into three scenarios:

  • Scenario 1: In this scenario, we train the learning model on the original MNIST dataset, where the dimension of each image is $28 \times 28$. This is equivalent to 100% of the original data size.

  • Scenario 2: In this scenario, we train the learning model on the MNIST dataset after the first, moderate level of Hensel's compression, which reduces both the image dimension and the data size (see Table 1).

    Figure 4: Samples from the different versions of the datasets used in the experiments. Sub-figures a), b), c) show samples from the original MNIST dataset after adding noise with three values of privacy leakage $\epsilon$. Sub-figures d), e), f) show samples after the first level of Hensel's compression and adding noise with the same three values of $\epsilon$. Sub-figures g), h), i) show samples after the second, stronger level of Hensel's compression and adding noise with the same three values of $\epsilon$.
  • Scenario 3: In this scenario, we train the learning model on the MNIST dataset after the second, stronger level of Hensel's compression, which further reduces the image dimension and the data size (see Table 1).

In each scenario, we evaluated the impact of the privacy leakage $\epsilon$ on the accuracy, considering three values of $\epsilon$. Table 1 summarizes the experimental parameters of each scenario.

Table 1: Training datasets' properties for each scenario: image dimension, data size, privacy leakage $\epsilon$, and Gaussian variance $\sigma^2$.
Figure 5: Scenario 1: Evaluating the impact of DP alone (i.e., without Hensel's compression) on the accuracy, considering three different values of the privacy leakage $\epsilon$. Image dimension: $28 \times 28$.

Figure 5 illustrates the accuracy of the learning model in the first scenario. Overall, we obtain a high accuracy by applying DP alone to the original MNIST dataset, and the accuracy remains high for all three values of the privacy leakage $\epsilon$. We notice that the accuracy decreases as the privacy leakage $\epsilon$ decreases, because more noise is added to the images when $\epsilon$ decreases. Regarding the privacy protection, see sub-figures 4-a), b), and c) for scenario 1: we can still recognize what the real image contains even after adding large noise to the dataset (i.e., the case of the smallest $\epsilon$, which corresponds to the Gaussian noise with the largest variance, see sub-figure 4-c).

Figure 6: Scenario 2: Evaluating the impact of the two layers (Hensel's compression, i.e., the first layer, and DP, i.e., the second layer) on the accuracy, considering three different values of the privacy leakage $\epsilon$.

Figure 6 illustrates the accuracy of the learning model in the second scenario. In this scenario, we applied the two layers of privacy protection (i.e., Hensel's compression and DP). The learning model achieves a high accuracy across the three values of the privacy leakage $\epsilon$, with the lowest accuracy obtained for the smallest $\epsilon$, where the most noise is added. Regarding the privacy protection, we can see that it is hard to distinguish the content of the images, especially for the smallest value of $\epsilon$ (see sub-figures 4-d), e), and f)).

Figure 7: Scenario 3: Evaluating the impact of the two layers (Hensel's compression, i.e., the first layer, and DP, i.e., the second layer) on the accuracy, considering three different values of the privacy leakage $\epsilon$.

Figure 7 illustrates the accuracy of the learning model in the third scenario. In this scenario, images are compressed with the stronger level of Hensel's compression. Overall, we get a good accuracy relative to the level of privacy protection achieved. For example, for the smallest privacy leakage $\epsilon$, the learning model still achieves a reasonable accuracy while ensuring very strong privacy protection: an attacker could not distinguish the images' content even if the attacker succeeded in recovering the training dataset. We notice that decreasing the privacy leakage $\epsilon$ increases the privacy protection while decreasing the accuracy.

To conclude, the accuracy and the privacy protection depend on the privacy leakage $\epsilon$ and on the level of data compression (i.e., Hensel's compression). The proposed approach achieves an acceptable or high accuracy while ensuring strong privacy protection. Specifically, this good trade-off is achieved in scenarios 2 and 3 (i.e., the two levels of Hensel's compression) for suitable values of the privacy leakage $\epsilon$.

It is important to note that training on the Hensel-compressed dataset gives roughly the same accuracy as training on 100% of the original data size. Thus, the proposed dimensionality reduction method not only strengthens privacy protection but also reduces the computational overhead. However, compressing the data too much hides characteristics of the images and hence decreases the accuracy. Thus, finding the optimal trade-off between the level of data compression and the privacy leakage $\epsilon$, one that guarantees strong privacy protection while achieving good accuracy, is of great importance.

4 Conclusion

In this paper, we propose a two-layer privacy-preserving method for FL. The first layer reduces the dimension of the original training dataset using Hensel's compression, whereas the second layer applies DP to the compressed dataset generated by the first layer. The experimental analysis validates the effectiveness of the proposed approach in protecting users' privacy while achieving good accuracy. Experimental results also show that the learning model accuracy depends on the level of dataset compression and on the DP privacy leakage $\epsilon$.

References

  • Almuallim and Dietterich (1991) Almuallim, H., Dietterich, T.G., 1991. Learning with many irrelevant features, in: Proceedings of the Ninth National Conference on Artificial Intelligence - Volume 2, AAAI Press. pp. 547–552.

  • Asoodeh et al. (2021) Asoodeh, S., Liao, J., Calmon, F.P., Kosut, O., Sankar, L., 2021. Three variants of differential privacy: Lossless conversion and applications. arXiv:2008.06529.
  • Chen (2003) Chen, X.w., 2003. An improved branch and bound algorithm for feature selection. Pattern Recognition Letters 24, 1925–1933. doi:10.1016/S0167-8655(03)00020-5.
  • Cyffers and Bellet (2021) Cyffers, E., Bellet, A., 2021. Privacy amplification by decentralization. arXiv:2012.05326.
  • Dong et al. (2019) Dong, J., Roth, A., Su, W.J., 2019. Gaussian differential privacy. arXiv:1905.02383.
  • Dwork (2008) Dwork, C., 2008. Differential privacy: A survey of results, in: Agrawal, M., Du, D., Duan, Z., Li, A. (Eds.), Theory and Applications of Models of Computation, Springer Berlin Heidelberg, Berlin, Heidelberg. pp. 1–19.
  • Dwork et al. (2006) Dwork, C., McSherry, F., Nissim, K., Smith, A., 2006. Calibrating noise to sensitivity in private data analysis, in: Proceedings of the Third Conference on Theory of Cryptography, Springer-Verlag, Berlin, Heidelberg. p. 265–284.
  • Dwork and Roth (2014) Dwork, C., Roth, A., 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9, 211–407. URL: https://doi.org/10.1561/0400000042, doi:10.1561/0400000042.
  • Fang et al. (2020) Fang, M., Cao, X., Jia, J., Gong, N., 2020. Local model poisoning attacks to byzantine-robust federated learning, in: 29th USENIX Security Symposium (USENIX Security 20), USENIX Association. pp. 1605–1622.
  • Fung et al. (2020) Fung, C., Yoon, C.J.M., Beschastnikh, I., 2020. The limitations of federated learning in sybil settings, in: 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020), USENIX Association, San Sebastian. pp. 301–316.
  • Gong et al. (2020) Gong, M., Feng, J., Xie, Y., 2020. Privacy-enhanced multi-party deep learning. Neural Networks 121, 484–496. doi:https://doi.org/10.1016/j.neunet.2019.10.001.
  • Kachouri et al. (2010) Kachouri, R., Djemal, K., Maaref, H., 2010. Adaptive feature selection for heterogeneous image databases, in: 2010 2nd International Conference on Image Processing Theory, Tools and Applications, pp. 26–31. doi:10.1109/IPTA.2010.5586751.
  • Kim et al. (2021) Kim, M., Günlü, O., Schaefer, R.F., 2021. Federated learning with local differential privacy: Trade-offs between privacy, utility, and communication, in: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2650–2654. doi:10.1109/ICASSP39728.2021.9413764.
  • Kira and Rendell (1992) Kira, K., Rendell, L.A., 1992. The feature selection problem: Traditional methods and a new algorithm, in: Proceedings of the Tenth National Conference on Artificial Intelligence, AAAI Press. p. 129–134.
  • Li et al. (2021) Li, Y., Zhou, Y., Jolfaei, A., Yu, D., Xu, G., Zheng, X., 2021. Privacy-preserving federated learning framework based on chained secure multiparty computing. IEEE Internet of Things Journal 8, 6178–6186. doi:10.1109/JIOT.2020.3022911.
  • Liu and Setiono (1996) Liu, H., Setiono, R., 1996. Feature selection and classification - a probabilistic wrapper approach, in: Proceedings of the 9th International Conference on Industrial and Engineering Applications of AI and ES, pp. 419–424.
  • Marill and Green (1963) Marill, T., Green, D., 1963. On the effectiveness of receptors in recognition systems. IEEE Transactions on Information Theory 9, 11–17. doi:10.1109/TIT.1963.1057810.
  • McDonald (1974) McDonald, B.R., 1974. Finite Rings with Identity. M. Dekker, New York.
  • Narendra and Fukunaga (1977) Narendra, P.M., Fukunaga, K., 1977. A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers C-26, 917–922. doi:10.1109/TC.1977.1674939.
  • Ouadrhiri and Abdelhadi (2022) Ouadrhiri, A.E., Abdelhadi, A., 2022. Differential privacy for deep and federated learning: A survey. IEEE Access 10, 22359–22380. doi:10.1109/ACCESS.2022.3151670.
  • Peng et al. (2005) Peng, H., Long, F., Ding, C., 2005. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1226–1238. doi:10.1109/TPAMI.2005.159.
  • Ren et al. (2021) Ren, H., Deng, J., Xie, X., 2021. Grnn: Generative regression neural network – a data leakage attack for federated learning. arXiv:2105.00529.
  • Somol et al. (2004) Somol, P., Pudil, P., Kittler, J., 2004. Fast branch bound algorithms for optimal feature selection. IEEE Transactions on Pattern Analysis and Machine Intelligence 26, 900–912. doi:10.1109/TPAMI.2004.28.
  • Tran et al. (2021) Tran, A.T., Luong, T.D., Karnjana, J., Huynh, V.N., 2021. An efficient approach for privacy preserving decentralized deep learning models based on secure multi-party computation. Neurocomputing 422, 245–262. doi:https://doi.org/10.1016/j.neucom.2020.10.014.
  • Wei et al. (2021) Wei, K., Li, J., Ding, M., Ma, C., Su, H., Zhang, B., Poor, H.V., 2021. User-level privacy-preserving federated learning: Analysis and performance optimization. IEEE Transactions on Mobile Computing, 1–1. doi:10.1109/TMC.2021.3056991.
  • Whitney (1971) Whitney, A., 1971. A direct method of nonparametric measurement selection. IEEE Transactions on Computers C-20, 1100–1103. doi:10.1109/T-C.1971.223410.
  • Wu et al. (2020) Wu, H., Chen, C.Y., Wang, L., 2020. A theoretical perspective on differentially private federated multi-task learning. ArXiv abs/2011.07179.
  • Yin et al. (2021) Yin, L., Feng, J., Xun, H., Sun, Z., Cheng, X., 2021. A privacy-preserving federated learning for multiparty data sharing in social iots. IEEE Transactions on Network Science and Engineering, 1–1. doi:10.1109/TNSE.2021.3074185.
  • Zhao et al. (2019) Zhao, J., Chen, Y., Zhang, W., 2019. Differential privacy preservation in deep learning: Challenges, opportunities and solutions. IEEE Access 7, 48901–48911. doi:10.1109/ACCESS.2019.2909559.