Privacy-Preserving Support Vector Machine Computing Using Random Unitary Transformation

08/20/2019 ∙ by Takahiro Maekawa, et al. ∙ 0

A privacy-preserving support vector machine (SVM) computing scheme is proposed in this paper. Cloud computing has been spreading in many fields. However, the cloud computing has some serious issues for end users, such as the unauthorized use of cloud services, data leaks, and privacy being compromised. Accordingly, we consider privacy-preserving SVM computing. We focus on protecting visual information of images by using a random unitary transformation. Some properties of the protected images are discussed. The proposed scheme enables us not only to protect images, but also to have the same performance as that of unprotected images even when using typical kernel functions such as the linear kernel, radial basis function(RBF) kernel and polynomial kernel. Moreover, it can be directly carried out by using well-known SVM algorithms, without preparing any algorithms specialized for secure SVM computing. In an experiment, the proposed scheme is applied to a face-based authentication algorithm with SVM classifiers to confirm the effectiveness.


1 Introduction

Cloud computing and edge computing have been spreading in many fields with the development of cloud services. However, these computing environments have some serious issues for end users, such as the unauthorized use of cloud services, data leaks, and privacy being compromised due to unreliable providers and accidents. A lot of studies on secure, efficient, and flexible communications, storage, and computation have been reported [1, 2, 3, 4, 5, 6]. For securing data, full encryption with provable security (like RSA and AES) is the most secure option. However, many multimedia applications have been seeking a trade-off in security to enable other requirements, e.g., low processing demands, retaining bitstream compliance, and flexible processing in the encrypted domain, so a lot of perceptual encryption schemes have been studied to achieve such a trade-off [6, 7, 8, 9, 10, 11, 12, 13, 14, 15].

In recent years, considerable efforts have been made in the fields of fully homomorphic encryption and multi-party computation [16, 17, 18, 19]. However, these schemes cannot yet be applied to SVM algorithms, although it is possible to carry out some statistical analysis of categorical and ordinal data. Moreover, the schemes require algorithms specialized for computing on encrypted data.

Because of this, we propose a privacy-preserving SVM computing scheme in this paper. We focus on images protected by using a random unitary transformation, which has been studied as one of the methods for cancelable biometrics [20, 21, 22, 23, 24, 25, 26], and then consider some properties of the protected images for secure SVM computing, where "images" mean features extracted from data. As a result, the proposed scheme enables us not only to protect images, but also to have the same performance as that of unprotected images under some useful kernel functions, such as isotropic stationary kernels. Moreover, it can be directly carried out by using well-known SVM algorithms, without preparing any algorithms specialized for secure SVM computing.

SVM is a typical machine learning algorithm that allows us to use kernel tricks. It is used here as an example of machine learning algorithms based on the Euclidean distance or the inner product between vectors. It is shown that the proposed scheme maintains the Euclidean distance and the inner product, so the scheme can also be applied to other machine learning algorithms.

In experiments, the proposed scheme is applied to a face recognition algorithm with SVM classifiers to confirm its effectiveness.

2 Preparation

2.1 Support Vector Machine

A support vector machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression tasks, although it is mostly used for classification. In SVM, we input a feature vector x to the discriminant function as

f(x) = w^T x + b,

where w is a weight parameter and b is a bias. SVM also has a technique called the "kernel trick", in which a function maps the low-dimensional input space to a higher-dimensional space. Such functions are called kernels. The kernel trick can be applied to the discriminant function above to map an input vector onto a higher-dimensional feature space and then to linearly classify it on that space as

f(x) = w^T φ(x) + b.    (1)

The function φ(·) maps an input vector x onto a high-dimensional feature space F, and the weight parameter w also lies in F (w ∈ F). The kernel function of two vectors x_i, x_j is defined as

K(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩,    (2)

where ⟨·, ·⟩ denotes an inner product. There are various kernel functions [27]. For example, the radial basis function (RBF) kernel is given by

K(x_i, x_j) = exp(−γ‖x_i − x_j‖²),    (3)

and the polynomial kernel is given by

K(x_i, x_j) = (x_i^T x_j + c)^d,    (4)

where γ is a hyperparameter for deciding the complexity of boundary determination, d is a parameter for deciding the degree of the polynomial, c is a constant, and ^T indicates transposition.
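As a concrete illustration, the three kernels above can be sketched in a few lines of NumPy. The parameter names gamma, c, and d are our own, chosen to match the hyperparameters described above:

```python
import numpy as np

def linear_kernel(x1, x2):
    # K(x_i, x_j) = x_i^T x_j
    return float(np.dot(x1, x2))

def rbf_kernel(x1, x2, gamma=0.1):
    # Eq. (3): K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    return float(np.exp(-gamma * np.sum((x1 - x2) ** 2)))

def polynomial_kernel(x1, x2, c=1.0, d=2):
    # Eq. (4): K(x_i, x_j) = (x_i^T x_j + c)^d
    return float((np.dot(x1, x2) + c) ** d)
```

For instance, for x1 = (1, 2) and x2 = (3, 4), the linear kernel gives 11 and the polynomial kernel with c = 1, d = 2 gives 144.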

This paper aims to propose a new framework for carrying out machine learning algorithms with protected vectors. SVM is used to demonstrate the effectiveness of the proposed scheme as an example of such algorithms.

2.2 Scenario

Figure 1 illustrates the scenario used in this paper. In the enrollment task, each client prepares training images. Next, the client creates protected images by using a secret key and sends them to a cloud server. The server stores them and implements learning with the protected images for a classification problem.

In the authentication task, a client creates a protected image as a query and sends it to the server. The server carries out the classification with a learning model prepared in advance, and then returns the result to the client.

Note that the cloud server has no secret keys, and the classification can be directly carried out by using a well-known SVM algorithm. In other words, the server does not have to prepare any algorithms specialized for classification in the encrypted domain.

Figure 1: Scenario

3 Proposed framework

In this section, the generation of protected images by using a random unitary matrix is described, and an SVM computation scheme with the protected images is proposed under the use of some kernel functions.

3.1 Protection of visual information

Protection schemes for visual information based on unitary transformations have been studied as one method for cancelable biometrics [20, 23, 24, 25, 21, 22]. This paper was inspired by those studies.

Let us transform an image with N pixels into a vector v = [v(1), …, v(N)]^T, where v(i) is a pixel value. The vector v is protected by a unitary matrix Q_p having randomness with a key p, as

v̂ = Q_p v,    (5)

where v̂ is a protected vector. Various generation schemes for Q_p have been studied to design unitary or orthogonal random matrices, such as Gram-Schmidt-based methods, random permutation matrices, and random phase matrices [25, 24]. For example, the Gram-Schmidt-based methods are applied to a pseudo-random matrix to generate Q_p. Security analysis of the protection schemes has also been considered in terms of brute-force attacks, diversity, and irreversibility.
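A minimal sketch of both generation schemes, assuming NumPy; QR decomposition plays the role of the Gram-Schmidt orthonormalization of a key-seeded pseudo-random matrix, and a key-seeded shuffle of the identity rows gives a permutation matrix. The function names are ours:

```python
import numpy as np

def gen_gram_schmidt(n, key):
    # Orthonormalize a pseudo-random matrix seeded with the secret key;
    # QR decomposition performs the Gram-Schmidt step, so Q is orthogonal.
    rng = np.random.default_rng(key)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def gen_permutation(n, key):
    # A random permutation matrix (entries 0 and 1): cheaper to generate
    # and to apply than the Gram-Schmidt-based matrix.
    rng = np.random.default_rng(key)
    return np.eye(n)[rng.permutation(n)]

def protect(v, q):
    # Eq. (5): v_hat = Q_p v
    return q @ v
```

Both generators are deterministic given the key, so a client can reproduce Q_p for every enrollment and query.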

3.2 SVM with protected images

3.2.1 Properties

Protected images generated according to Eq. (5) have the following properties under the use of a common key [25].

Property 1: Conservation of Euclidean distances:

‖v_i − v_j‖ = ‖v̂_i − v̂_j‖    (6)

Property 2: Conservation of inner products:

v_i^T v_j = v̂_i^T v̂_j    (7)

Property 3: Conservation of correlation coefficients:

(v_i^T v_j)/(‖v_i‖‖v_j‖) = (v̂_i^T v̂_j)/(‖v̂_i‖‖v̂_j‖)    (8)

where v_j is a vector of another client, who has M training samples.
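These three properties follow from Q_p being unitary and can be checked numerically. The sketch below, assuming NumPy, treats the correlation coefficient of property 3 as the normalized inner product of the two vectors; the function names are ours:

```python
import numpy as np

def distance_conserved(q, vi, vj, tol=1e-10):
    # Property 1: ||Q vi - Q vj|| == ||vi - vj||
    return abs(np.linalg.norm(q @ vi - q @ vj) - np.linalg.norm(vi - vj)) < tol

def inner_product_conserved(q, vi, vj, tol=1e-10):
    # Property 2: (Q vi)^T (Q vj) == vi^T vj
    return abs((q @ vi) @ (q @ vj) - vi @ vj) < tol

def correlation_conserved(q, vi, vj, tol=1e-10):
    # Property 3: normalized inner products match
    def corr(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return abs(corr(q @ vi, q @ vj) - corr(vi, vj)) < tol
```

All three checks pass for any orthogonal q and any pair of vectors protected with that same q.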

3.2.2 Classes of kernels

We consider applying encrypted images to the kernel trick. In the case of using the RBF kernel, the following relation is satisfied from property 1 and Eq. (3):

K(v̂_i, v̂_j) = exp(−γ‖v̂_i − v̂_j‖²) = exp(−γ‖v_i − v_j‖²) = K(v_i, v_j).    (9)

Therefore, protected images have no influence on the performance when using kernel functions based on the Euclidean distance, such as the RBF kernel. We call the class of these Euclidean-distance-based kernel functions class 1 in this paper.

In addition, from property 2, we can also use a kernel that depends only on the inner product between two vectors. The polynomial kernel and linear kernel are in this class, referred to as class 2. The following relations are satisfied under property 2:

K(v̂_i, v̂_j) = (v̂_i^T v̂_j + c)^d = (v_i^T v_j + c)^d = K(v_i, v_j)    (10)
K(v̂_i, v̂_j) = v̂_i^T v̂_j = v_i^T v_j = K(v_i, v_j)    (11)

3.2.3 Dual problem

Next, we consider binary classification, i.e., the task of classifying the elements of a given set into two classes. A dual problem for implementing an SVM classifier with protected images is expressed as

max_α Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j v̂_i^T v̂_j
subject to Σ_i α_i y_i = 0, 0 ≤ α_i ≤ C,    (12)

where y_i ∈ {+1, −1} are the correct labels for each piece of training data, α_i are dual variables, and C is a regularization coefficient. If we use kernel class 1 or class 2 described above, the inner product v̂_i^T v̂_j is equal to v_i^T v_j. Therefore, even in the case of using protected images, the dual problem with protected images is reduced to the same problem as that of the original images. This means that the use of the protected images has no effect on the performance of the SVM classifier under kernel class 1 and class 2.
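This conclusion can be checked without solving the dual problem at all: the objective in Eq. (12) depends on the training data only through the Gram matrix of pairwise inner products (or, for the RBF kernel, pairwise distances), so identical Gram matrices imply identical optimizers. A sketch, assuming NumPy:

```python
import numpy as np

def rbf_gram(X, gamma=0.1):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) for the rows of X
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal Q_p
X = rng.standard_normal((5, 8))                   # 5 training vectors (rows)
X_prot = X @ q.T                                  # each row protected: Q_p v

# Identical Gram matrices -> identical dual problems -> identical classifiers.
assert np.allclose(X_prot @ X_prot.T, X @ X.T)     # linear / polynomial case
assert np.allclose(rbf_gram(X_prot), rbf_gram(X))  # RBF case
```

Any off-the-shelf SVM solver fed these two Gram matrices would therefore return the same support vectors and dual variables.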

3.3 Relation among keys

As shown in Fig. 1, a protected image is generated from a training image by using a key. Two relations among keys are summarized here.

3.3.1 Key condition 1

The first key choice is to use a common key for all clients. In this case, all protected images satisfy the properties described in 3.2, so the SVM classifier has the same performance as that of using the original images.

3.3.2 Key condition 2

The second key choice is to use a different key for each client. In this case, the three properties are satisfied only among images protected with a common key. This key condition allows us to enhance the robustness against various attacks, as discussed later.
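The effect of different keys can be seen directly on property 2: with a common key, inner products between two clients' protected vectors are conserved, while with different keys they generally are not, which is what blocks cross-key matching. A small sketch, assuming NumPy (function names are ours):

```python
import numpy as np

def orthogonal_from_key(n, key):
    # Key-seeded random orthogonal matrix (QR of a pseudo-random matrix).
    rng = np.random.default_rng(key)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def inner_product_gap(v, w, key1, key2):
    # |(Q_{key1} v)^T (Q_{key2} w) - v^T w|: zero when key1 == key2,
    # generally non-zero when the keys differ.
    n = len(v)
    q1 = orthogonal_from_key(n, key1)
    q2 = orthogonal_from_key(n, key2)
    return abs((q1 @ v) @ (q2 @ w) - v @ w)
```

With key1 == key2 the gap is numerically zero; with two independent keys the gap is generically large, so a classifier trained under one key does not accept queries protected under another.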

Under key condition 2, we can consider two types of spoofing attacks. The first is the case in which secret keys leak out and an attacker uses them: the attacker can try to be authenticated by the system with a leaked key. The second is the case in which original images of some client leak out: the attacker can try to be authenticated with the original images transformed by a key created by the attacker, as described in our experiments.

4 Experimental Results

The proposed scheme was applied to facial recognition experiments that were carried out as a dual problem.

4.1 Data Set

We used the Extended Yale Face Database B [26], which consists of frontal facial images like those in Fig. 2. It is assumed that there were clients (users), a cloud server, and an attacker (a malicious third party) in this paper. 36 people were used as clients, and the 64 images for each person were divided randomly in half into training data samples and queries. 1 person from the database was used as an attacker, and 32 images of the attacker were used as queries. We used random permutation matrices as an example of unitary matrices to produce protected images, although there are other transformations such as the Gram-Schmidt-based method. It is known that random permutation matrices have an advantage in terms of lower computational complexity compared with the Gram-Schmidt-based method [25]. Any unitary transformation with randomness is applicable to the proposed scheme. Besides, the RBF kernel and linear kernel were used; they belong to kernel class 1 and class 2, respectively. The protection was applied to images with 1216 dimensions generated by the down-sampling method [23]. The down-sampling method divides an image into non-overlapped blocks and then calculates the mean value in each block. Figure 3 shows examples of an original image and a protected one. Here, the protected image was created by a random permutation matrix, which consists of 0s and 1s.
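The down-sampling step can be sketched as a block-mean operation, assuming NumPy and that the image height and width are divisible by the block size:

```python
import numpy as np

def downsample(img, block):
    # Divide the image into non-overlapped (block x block) regions and
    # replace each region with its mean value.
    h, w = img.shape
    assert h % block == 0 and w % block == 0
    return img.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
```

A down-sampled image is then flattened into a vector before being protected with Eq. (5).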

(a) person1 (b) person2
Figure 2: Examples of Extended Yale Face Database B
(a) image (b) protected
Figure 3: An example of protection

4.2 Results and Discussion

In facial recognition with SVM classifiers, one classifier is created for each enrollee. The classifier outputs a predicted class label and a classification score for each protected query image. The classification score is the distance from a query to the decision boundary. A query is given a positive label when the classification score s is at least a threshold τ:

s ≥ τ.    (13)

In the experiments, the false reject rate (FRR), false accept rate (FAR), and equal error rate (EER), at which the FAR is equal to the FRR, were used to evaluate the performance. As described in 4.1, face images of 37 people, including 1 attacker, were prepared, and there were 64 images for each person. 1152 images of 36 people were used for training, and the other 1152 images of the 36 people and the 32 images of the attacker were used as queries for authentication, under various key conditions.
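The evaluation metrics above can be sketched as follows, with a grid search over thresholds approximating the EER. The function names are ours; "genuine" scores are those of correct-person queries and "impostor" scores those of all others:

```python
import numpy as np

def far_frr(genuine, impostor, tau):
    # FRR: fraction of genuine queries rejected (score below the threshold).
    # FAR: fraction of impostor queries accepted (score at or above it).
    frr = float(np.mean(np.asarray(genuine) < tau))
    far = float(np.mean(np.asarray(impostor) >= tau))
    return far, frr

def eer(genuine, impostor, taus):
    # EER: the operating point where FAR is (approximately) equal to FRR.
    pairs = [far_frr(genuine, impostor, t) for t in taus]
    far, frr = min(pairs, key=lambda p: abs(p[0] - p[1]))
    return (far + frr) / 2.0
```

For example, with genuine scores (0.9, 0.8, 0.2) and impostor scores (0.1, 0.3, 0.7), a threshold of 0.5 gives FAR = FRR = 1/3, which is also the EER.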

4.2.1 Key condition 1

Figure 4 shows results in the case of using key condition 1. The results demonstrate that SVM classifiers with protected images (protected in Fig. 4) performed the same as SVM classifiers with the original images (not protected in Fig. 4).

In the experiment, when the 32 query images of person 1 were used, the FRR value of person 1 under a threshold value was calculated as follows. The number of query images that were rejected as another person according to Eq. (13) was counted, and then the rate of rejected images among the 32 queries was calculated. Finally, the average of these values over the 36 people was obtained as the FRR.

The FAR value of person 1 under a threshold value was calculated as follows. When the query images excluding those of person 1 were used, the number of images that were accepted as person 1 according to Eq. (13) was counted, and then the rate of accepted images was calculated. Finally, the average of these values over the 36 people was obtained as the FAR. From the results, it is confirmed that the proposed scheme has no effect on the performance of SVM classifiers under key condition 1.

4.2.2 Key condition 2

Figure 5 shows results in the case of using key condition 2. Under this condition, it is expected that a query will be authenticated only when it meets two requirements, i.e., the same key and the same person, whereas only the same person is required under key condition 1. Therefore, the performances in Fig. 5 were slightly different from those in Fig. 4; the FAR performances for key condition 2 were better due to the stricter requirements.

4.2.3 Unauthorized outflow

Next, it is assumed that a key or an image leaks from a client. An attacker (a malicious third party) may be able to spoof a user with a leaked key or leaked images. If a private image leaks, its visual information can no longer be protected, but the spoofing attack itself may still be prevented by using encrypted images. In this experiment, we evaluated FAR performances when a key or images leaked out and an attacker spoofed persons 1 to 36 with the leaked key or the leaked images. When a key leaked, the attacker spoofed a user with the leaked key and images of the attacker. In contrast, when images leaked, the attacker spoofed a user with the leaked images and a key prepared by the attacker; the protected images used for spoofing were generated from the leaked images and that key. When the 32 images prepared by the attacker were used as queries, the FAR value of user 1 was calculated as follows. The number of images that were accepted as person 1 according to Eq. (13) was counted, and then the rate of falsely accepted images was calculated. Finally, the average of these values over the 36 users was obtained as the FAR.

Figure 6 shows the FAR performance in the case that a key leaked out. In this situation, the attacker could use the key without any authorization for spoofing attacks. "FAR protected (key leaked, common key)" indicates FAR values when all clients used the same key and the key leaked out. "FAR protected (key leaked, different keys)" indicates FAR values when each client used a different key and one key leaked out. The latter was better than the former. Therefore, it is confirmed that the security against spoofing with a leaked key is enhanced if we can use key condition 2.

Figure 7 shows the FAR performance in the case that images of a person leaked out. "FAR protected (image leaked, common key)" indicates FAR values when all clients used the same key and the images leaked out. "FAR protected (image leaked, different keys)" indicates FAR values when each client used a different key and the images leaked out. As in Fig. 6, FAR values under the use of different keys were lower than those under the same key.

From these results, FAR values under the use of different keys are improved not only when authentication is carried out by an enrolled user, but also when an attacker spoofs users with a leaked key or leaked images. Therefore, the use of key condition 2 enhances the robustness against spoofing attacks.

(a) Linear kernel ()
(b) RBF kernel (, )
Figure 4: FAR and FRR (key condition 1)
Figure 5: FAR and FRR (RBF kernel, key condition 2)
Figure 6: FAR with leaked keys (RBF kernel)
Figure 7: FAR with leaked original images (RBF kernel)

5 Conclusion

In this paper, we proposed a privacy-preserving SVM computing scheme with protected images. It was shown that images protected by a unitary transform have some useful properties, and that these properties allow us to securely compute SVM algorithms without any degradation in performance. In addition, two key conditions were considered to enhance the robustness against various attacks. Face-based authentication experiments using SVM classifiers were also carried out to experimentally confirm the effectiveness of the proposed scheme.

Acknowledgements

This work was partially supported by Grant-in-Aid for Scientific Research (B), No. 17H03267, from the Japan Society for the Promotion of Science.

References

  • [1] K. Nakamura, N. Nitta, and N. Babaguchi, “Encryption-free framework of privacy-preserving image recognition for photo-based information services,” IEEE Transactions on Information Forensics and Security, pp.1264–1279, 2019.
  • [2] M. Ra, R. Govindan, and A. Ortega, “P3: Toward privacy-preserving photo sharing.,” USENIX Symposium on Networked Systems Design and Implementation, pp.515–528, 2013.
  • [3] C. T. Huang, L. Huang, Z. Qin, H. Yuan, L. Zhou, V. Varadharajan, and C-C. J. Kuo, “Survey on securing data storage in the cloud,” APSIPA Transactions on Signal and Information Processing, vol.3, 2014.
  • [4] R. Lazzeretti and M. Barni, “Private computing with garbled circuits [applications corner],” IEEE Signal Processing Magazine, vol.30, no.2, pp.123–127, 2013.
  • [5] M. Barni, G. Droandi, and R. Lazzeretti, “Privacy protection in biometric-based recognition systems: A marriage between cryptography and signal processing,” IEEE Signal Processing Magazine, vol.32, no.5, pp.66–76, 2015.
  • [6] R.L. Lagendijk, Z. Erkin, and M. Barni, “Encrypted signal processing for privacy protection: Conveying the utility of homomorphic encryption and multiparty computation,” IEEE Signal Processing Magazine, vol.30, no.1, pp.82–105, 2013.
  • [7] I. Ito and H. Kiya, “One-time key based phase scrambling for phaseonly correlation between visually protected images,” EURASIP J. Information Security, 2010.
  • [8] T. Chuman, K. Kurihara, and H. Kiya, “On the security of block scrambling-based etc systems against jigsaw puzzle solver attacks,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2157–2161, 2017.
  • [9] J. Zhou, X. Liu, O. C. Au, and Y. Y. Tang, “Designing an efficient image encryption-then-compression system via prediction error clustering and random permutation,” IEEE Transactions on Information Forensics and Security, pp.39–50, 2014.
  • [10] K. Kurihara, S. Shiota, and H. Kiya, “An encryption-then-compression system for jpeg standard,” Picture Coding Symposium (PCS), pp.119–123, 2015.
  • [11] K. Kurihara, M. Kikuchi, S. Imaizumi, S. Shiota, and H. Kiya, “An encryption-then-compression system for jpeg/motion jpeg standard,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, pp.2238–2245, 2015.
  • [12] T. Chuman, K. Kurihara, and H. Kiya, “On the security of block scrambling-based etc systems against jigsaw puzzle solver attacks,” 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2157–2161, 2017.
  • [13] T. Chuman, K. Kurihara, and H. Kiya, “Security evaluation for block scrambling-based etc systems against extended jigsaw puzzle solver attacks,” 2017 IEEE International Conference on Multimedia and Expo (ICME), pp.229–234, 2017.
  • [14] T. Chuman, W. Sirichotedumrong, and H. Kiya, “Encryption-then-compression systems using grayscale-based image encryption for jpeg images,” IEEE Transactions on Information Forensics and security, pp.1–1, 2018.
  • [15] W. Sirichotedumrong and H. Kiya, “Grayscale-based block scrambling image encryption using ycbcr color space for encryption-then-compression systems,” APSIPA Transactions on Signal and Information Processing, 2019. (Accepted).
  • [16] T. Araki, A. Barak, J. Furukawa, T. Lichter, Y. Lindell, A. Nof, K. Ohara, A. Watzman, and O. Weinstein, “Optimized honest-majority mpc for malicious adversaries - breaking the 1 billion-gate per second barrier,” IEEE Symposium on Security and Privacy (SP), pp.843–862, 2017.
  • [17] T. Araki, J. Furukawa, Y. Lindell, A. Nof, and K. Ohara, “High-throughput semi-honest secure three-party computation with an honest majority,” Proceedings of ACM SIGSAC Conference on Computer and Communications Security, pp.805–817, 2016.
  • [18] W. Lu, S. Kawasaki, and J. Sakuma, “Using fully homomorphic encryption for statistical analysis of categorical, ordinal and numerical data,” IACR Cryptology ePrint Archive, p.1163, 2016.
  • [19] Y. Aono, T. Hayashi, L. Phong, and L. Wang, “Privacy-preserving logistic regression with distributed data sources via homomorphic encryption,” IEICE Transactions on Information and Systems, vol.E99.D, no.8, pp.2079–2089, 2016.

  • [20] C. Rathgeb, and A. Uhl, “A survey on biometric cryptosystems and cancelable biometrics,” EURASIP J. Information Security, pp.1–25, 2011.
  • [21] K. Nandakumar, A. K. Jain, “Biometric template protection: Bridging the performance gap between theory and practice,” Signal Processing Magazine, IEEE, pp.88–100, 2015.
  • [22] S. Rane, “Standardization of biometric template protection,” Signal Processing Magazine, IEEE, 2014.
  • [23] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2009.
  • [24] I. Nakamura, Y. Tonomura, and H. Kiya, “Unitary transform-based template protection and its properties,” European Signal Processing Conference, pp.2466–2470, 2015.
  • [25] I. Nakamura, Y. Tonomura, and H. Kiya, “Unitary transform-based template protection and its application to l2-norm minimization problems,” IEICE Transactions on Information and Systems, pp.60–68, 2016.
  • [26] A.S. Georghiades, P.N. Belhumeur, and D.J. Kriegman, “From few to many: Illumination cone models for face recognition under variable lighting and pose,” IEEE Trans. Pattern Analysis and Machine Intelligence, pp.643–660, 2001.
  • [27] M.G. Genton, “Classes of kernels for machine learning: A statistics perspective,” J. Mach. Learn. Res., vol.2, pp.299–312, 2002.