I Introduction
With the rise of machine-learning-as-a-service (MLaaS) [1,2], online pretrained models are increasingly used in applications such as facial recognition [3,4]. For the past decade, the accuracy of trained facial recognition models was the sole priority; in recent years, however, preserving the privacy of the people whose personal photos serve as training data has become a major concern. In the context of these services, an imminent threat is that an attacker can reconstruct a recognizable image of a person given only that person's name and the confidence score returned by the API. A breach of sensitive, confidential, or protected data is then no longer merely a financial risk; it may put victims in life-threatening danger. In a reconstruction attack recently introduced by Fredrikson et al. [5], adversarial access to a trained facial recognition model is abused to reconstruct the face of an individual, given the corresponding name or identifier, via gradient descent optimization. However, the results of Fredrikson et al. [5] also showed that a reasonable level of rounding of the confidence scores completely obviates the threat of gradient-descent-based attacks. Unfortunately, this simple countermeasure cannot guarantee the privacy of personal images against more sophisticated attacks. In this paper, we show that an attacker can pose a significant threat to face recognition systems by using optimization techniques, such as random search, that do not depend on gradients. We disclose the threat posed by face composition parts: they allow the attacker to form a state space of faces and search it for a composite face with maximum confidence score. As a result, rounding the confidence score no longer stops the attacker, and the training data does not remain private.
The strength of the proposed attack is that, rather than trying to reconstruct an identical face, it synthesizes an instance that resembles the target face. This policy lets the attacker approximate a difficult problem by a nearby problem that is easier to solve. Our observations show that this type of attack requires more sophisticated countermeasures. To tackle this formidable threat, we propose Face Detection Score Filtering (FDSF) as an effective countermeasure. The main idea of FDSF is to return a high confidence score for faces that receive a low Face Detection Score (FDS). The intuition behind FDSF is that composite faces receive lower FDS than real faces, so returning high confidence scores for low-quality faces misleads the attacker onto a wrong search path. The experimental results show that the proposed attack cannot be nullified by a rounding policy. Moreover, the reconstructed faces significantly resemble the target faces in terms of 17 face characteristics measured by an independent face recognition system. Our contributions are as follows.

We show the potential of composite faces to pose a significant threat to the privacy of face recognition systems.

We show that random search undermines the efficiency of the rounding policy as a countermeasure.

We propose Face Detection Score Filtering (FDSF) as an efficient countermeasure against the proposed face reconstruction attack.
The rest of the paper is organized as follows. Section II reviews related work. Section III demonstrates how composite faces can be used to threaten the privacy of face recognition systems. Section IV explains the proposed countermeasure. Section V presents the experimental results, and Section VI concludes the paper.
II Related Work
Attackers may take different approaches to violate the privacy of training data and gain access to sensitive user data or key information about the model architecture. These approaches are categorized as model inversion and membership inference. In this section, we review research in each category.
II-A Model Inversion Attacks
Model inversion attacks are designed to reveal sensitive information about the data used in the training phase. Given a class label, the attacker attempts to create an input that maximizes the corresponding confidence score of the target class. Against a facial recognition system, such attacks are called reconstruction attacks: given a name or identifier, the attacker produces an approximate image of one of the participants whose image was used in the training phase. In general, image reconstruction attacks fall into two categories: optimization-based and training-based.
II-A1 Optimization-based Reconstruction Attacks
Optimization-based data reconstruction has a long history in machine learning [7]. Lee et al. [6] attacked a multilayer feedforward mapping network using the gradient of a Lyapunov function to solve the inverse mapping of a continuous function. Since the mapping from the output space to the input space is one-to-many, this is considered an ill-posed problem. To address it, Lu et al. [8] formulated the inverse problem as a nonlinear programming (NLP) problem. Mahendran et al. [9] proposed a general framework to reconstruct an image from computer vision features such as HOG [10] and SIFT [11]. Fredrikson et al. [12] presented the first end-to-end case study of differential privacy in a medical application based on gradient maximization of the class score. Fredrikson et al. [5] used Denoising Autoencoders (DAE) and sharpening filters as the prior in a reconstruction attack. Mahendran et al. [13] employed a regularized energy minimization framework and “natural pre-images” to reconstruct an image from its representation.
II-A2 Training-based Reconstruction Attacks
Given a target model, training-based model inversion attacks train a new neural network to approximate the mapping between confidence scores and input images. Dosovitskiy et al. [14] trained convolutional networks to reconstruct images from shallow representations including HOG, SIFT, and LBP. Dosovitskiy et al. [15] measured the perceptual similarity of images using deep features extracted with autoencoders, variational autoencoders, and deep convolutional networks. Nash et al. [16] used flexible autoregressive neural density models for the inversion of supervised representations. Yang et al. [17] used auxiliary training data, considered a resembling version of the original training data, for model inversion. They also showed that partial predictions obtained from the target training data can be used to construct a comprehensive inversion model.
II-B Membership Inference Attacks
Given the name or identifier of a person, the goal of a membership inference attack is to reveal whether that person's information was used as training data. For example, inference attacks against a cancer diagnosis system can be exploited by an adversary to determine whether a specific person has been diagnosed with cancer. One of the first studies to demonstrate the eminence of inference attacks was by Homer et al. [18]. Backes et al. [19] studied the viability of membership inference against the privacy of individuals contributing their microRNA expressions to training datasets. Homer et al. [20] focused on attacks on genomic research studies, where an attacker infers the membership of a specific person's data within an aggregate genomic dataset, or aggregate locations [21]. Shokri et al. [22] used multiple “shadow models” that approximate the behavior of the target model, training the attack model on the labeled inputs and outputs of the shadow models. Hayes et al. [23] studied to what extent membership inference attacks succeed against generative models. Truex et al. [24] studied model vulnerability through a generalized formulation of black-box membership inference attacks using different model combinations and multiple datasets. Melis et al. [25] studied the possibility of successful membership inference attacks in distributed learning systems.
III Composite Face Reconstruction Attack
In this section, we show how attackers can take advantage of composite faces to design more sophisticated attacks. Among all possible attack strategies [32], we discuss random search as the most imminent one.
III-A Composite Faces
The construction of composite faces has a long history in crime investigation [29]. In this paper, we assume that the attacker uses publicly available face-compositing software to collect enough face composition parts to design an efficient face reconstruction attack. To simulate this kind of threat, we used a free tool called PortraitPad [26]. The collected face composition dataset includes one base head, 100 eyes, 73 lips, 34 noses, 50 brows, 25 hairs, 10 glasses, and 10 mustaches. Figure 1 shows some of the collected composition parts.
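As a rough illustration of the threat, the number of distinct composite faces implied by the part counts above can be computed directly. This is a sketch that treats each category as exactly one mandatory choice; the single base head contributes no extra combinations.

```python
# Search-space size implied by the collected PortraitPad composition
# parts (counts taken from the text above; one part chosen per category).
part_counts = {
    "eyes": 100, "lips": 73, "noses": 34, "brows": 50,
    "hairs": 25, "glasses": 10, "mustaches": 10,
}

n_faces = 1
for count in part_counts.values():
    n_faces *= count

print(n_faces)  # 31_025_000_000 possible composite faces
```

Even this modest part library yields a state space of roughly 3.1 × 10^10 faces, far too large to enumerate, which is why the attacker turns to random search.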
III-B Attack Specifications
In this section, we define the specifications of a face reconstruction attack based on composite faces.
In such sophisticated attacks, exact reconstruction of the target face is impossible. Rather, the goal of the attacker is to find the most similar face to the target face. This relaxation of the problem makes the threat even more dangerous. The main operation in this context is to search the state space $S$ of composite faces so as to maximize the confidence score returned by the target retrieval model, as shown in Figure 2. We can express this operation as

\hat{x} = \arg\max_{x \in S} f_t(x),

where $\hat{x}$ is the reconstructed face, $t$ is the target face, and $f_t(x)$ is the confidence score returned for $x$ with respect to $t$.
We suppose that the attacker has no information about the structure of the target model; hence, this is a black-box attack. The type of the target model does not matter: it can be either a face recognition model or a face retrieval model. In either case, such attacks pose a significant threat to data privacy, since the attacker searches for and reconstructs the most similar face to the target. Among all possible strategies to search this space, we discuss random search.
III-C Random Search
Random search [31] is a class of optimization methods that can find global extrema without using the gradient of the problem. It was proposed by Anderson [35] and then extended by Rastrigin [36] and Karnopp [37]. Random search techniques are very useful when there are several local extrema in the state space. When using face composition parts for a face reconstruction attack, the attacker may select this approach to evade the rounding policy as a countermeasure. Suppose $n_i$ denotes the number of available composition parts of type $i$ (eyes, lips, noses, and so on). The size of the state space, i.e., the number of possible synthesized faces, is $\prod_i n_i$. Clearly, in the worst case the attacker must search the entire state space, which is very time consuming. Yet even this worst-case scenario from the attacker's point of view is daunting from a privacy point of view, because in the end the attacker finds the most similar synthesized face to the target face. Needless to say, random search makes this process easier for the attacker. However, we still need to show that random search guarantees convergence.
We can define the random search problem as follows. Given a target model $f$ from $\mathbb{R}^n$ to $\mathbb{R}$ and a state space $S \subseteq \mathbb{R}^n$, we search for a point $x$ in $S$ that maximizes the confidence score $f(x)$ returned by the target model.
The conceptual steps of random search are as follows [31].

Step 1: Find $x_0$ in $S$ and set $k = 0$.

Step 2: Generate $\xi_k$ from the sample space $(\mathbb{R}^n, \mathcal{B}, \mu_k)$.

Step 3: Set $x_{k+1} = D(x_k, \xi_k)$, choose $\mu_{k+1}$, set $k = k + 1$, and return to Step 2.
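The conceptual steps above can be sketched as a blind random search over the composite-face state space. Here `parts` (a mapping from each category to its candidate components) and the black-box `confidence` oracle are assumptions standing in for the attacker's part library and the target model's API; this is a minimal sketch, not the paper's implementation.

```python
import random

def random_search(parts, confidence, iterations=1000, seed=0):
    """Pure random search over composite faces (Solis-Wets style).

    Each face is a dict mapping a category (eyes, lips, ...) to one
    chosen component; confidence(face) is the black-box score the
    target model would return (an assumed oracle)."""
    rng = random.Random(seed)
    # Step 1: start from an arbitrary point x0 in the state space S.
    best = {cat: rng.choice(options) for cat, options in parts.items()}
    best_score = confidence(best)
    for _ in range(iterations):
        # Step 2: draw a candidate xi_k from the sample space.
        candidate = {cat: rng.choice(options) for cat, options in parts.items()}
        score = confidence(candidate)
        # Step 3: the map D(x_k, xi_k) keeps whichever point scores
        # higher, satisfying condition (H1) for maximization.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Because each accepted step never decreases the score, the sequence of best scores is monotone, which is exactly the property the convergence argument below relies on.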
The map $D$ with domain $S \times \mathbb{R}^n$ and range $S$ satisfies the following condition (H1):

f(D(x, \xi)) \ge f(x), \quad \text{and if } \xi \in S, \text{ then } f(D(x, \xi)) \ge \max\{f(x), f(\xi)\}.
The $\mu_k$ are probability measures corresponding to distribution functions defined on $\mathbb{R}^n$. We also require condition (H2): for any subset $A$ of $S$ with positive measure $v(A) > 0$, $\prod_{k=0}^{\infty} (1 - \mu_k(A)) = 0$; that is, the sampling must not indefinitely avoid any region of positive measure.
THEOREM 1 (Convergence Theorem [31]). Suppose that $f$ is a measurable function, $S$ is a measurable subset of $\mathbb{R}^n$, and conditions (H1) and (H2) are satisfied. Let $\{x_k\}_{k=0}^{\infty}$ be the sequence generated by the algorithm. Then

\lim_{k \to \infty} P[x_k \in R_\varepsilon] = 1,

where $R_\varepsilon \subseteq S$ is the optimality region (the points whose score lies within $\varepsilon$ of the supremum of $f$ over $S$) and $P[x_k \in R_\varepsilon]$ is the probability that at step $k$ the point $x_k$ generated by the algorithm is in $R_\varepsilon$.
PROOF. From (H1) it follows that $x_l \in R_\varepsilon$ or $\xi_l \in R_\varepsilon$ implies $x_k \in R_\varepsilon$ for all $k > l$. Thus

P[x_k \notin R_\varepsilon] \le \prod_{l=0}^{k-1} \big(1 - \mu_l(R_\varepsilon)\big),   (1)

and hence, by (H2),

\lim_{k \to \infty} P[x_k \in R_\varepsilon] \ge 1 - \lim_{k \to \infty} \prod_{l=0}^{k-1} \big(1 - \mu_l(R_\varepsilon)\big) = 1.   (2)
III-C1 Stopping Criterion
Constructing an infinite sequence is the ideal scenario. In practice, however, we must define a stopping criterion [31] that lets us stop the algorithm after a finite number of iterations.
Given $\delta \in (0, 1)$, we need to find $N$ such that, for all $k \geq N$,

P[x_k \in R_\varepsilon] \ge 1 - \delta.   (3)

Suppose the measures $\mu_k$ are uniform on $S$, so that $\mu_k(R_\varepsilon) = v(R_\varepsilon)/v(S) = p$ for every $k$. Choosing an integer $N \geq \ln\delta / \ln(1 - p)$ has the required property, since for $k \geq N$ it follows that

\prod_{l=0}^{k-1} \big(1 - \mu_l(R_\varepsilon)\big) = (1 - p)^k \le (1 - p)^N \le \delta,

and hence $P[x_k \in R_\varepsilon] \geq 1 - \delta$.
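Under the uniform-sampling assumption, this bound translates directly into a query budget for the attacker. The numeric values below are illustrative assumptions, not measurements from the paper.

```python
import math

def iterations_needed(p, delta):
    """Smallest N with (1 - p)**N <= delta: after N independent uniform
    draws, each hitting the optimality region with probability p, the
    probability of never having hit it is at most delta."""
    return math.ceil(math.log(delta) / math.log(1.0 - p))

# Illustrative: if 1 in 10,000 composite faces lies in the optimality
# region and the attacker wants 99% success probability,
print(iterations_needed(1e-4, 0.01))  # about 46,050 queries
```

The budget grows only logarithmically in the failure probability, which is why rounding the returned confidence score does nothing to make this search infeasible.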
IV Countermeasure
In this section, we explore a potential approach for developing an effective countermeasure against the proposed face reconstruction attack, based on maximum likelihood estimators.
IV-A Countermeasure Strategy
When using random search, the attacker must repeatedly change the face composition parts to find the most similar face to the target face. Such synthesized faces cannot be guaranteed to look realistic for all combinations of parts. The best way to exploit this weakness is to return a 100% confidence score whenever the submitted face is detected as a fake face, as shown in Figure 3. This strategy forces the attacker onto a wrong search path toward a suboptimal face.
IV-B Face Detection Score (FDS)
For face detection, image patches of different sizes and from different positions in the input image are transformed to eigenspace, and the Maximum Likelihood Probability (MLP) is estimated as follows [38]:

\hat{P}(x \mid \Omega) = \frac{\exp\!\big(-\tfrac{1}{2}\sum_{i=1}^{M} y_i^2/\lambda_i\big)}{(2\pi)^{M/2}\prod_{i=1}^{M}\lambda_i^{1/2}} \cdot \frac{\exp\!\big(-\varepsilon^2(x)/2\rho\big)}{(2\pi\rho)^{(N-M)/2}},   (4)

where $y = (y_1, \ldots, y_M)$ is the input image transformed to eigenspace, the $\lambda_i$ are the corresponding eigenvalues, $M$ is the number of eigenfaces used for estimation, $\varepsilon^2(x)$ approximates the distance from feature space, $\rho$ is a residual variance estimate, and $N$ is the dimensionality of the patch. For face recognition, the MLP of the similarity between two images $x_1$ and $x_2$ is calculated by applying the same estimator to their difference image [38]:

S(x_1, x_2) = \hat{P}(x_1 - x_2 \mid \Omega_I),   (5)

where $\Omega_I$ denotes the class of intrapersonal variations.
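The quantities named around Eq. (4) — the eigenspace projection, eigenvalues, the number of eigenfaces, and the distance-from-feature-space residual — correspond to the classic eigenspace maximum-likelihood estimator. Below is a minimal sketch under that assumption; the function name, the value of the residual variance `rho`, and the test data are illustrative, not taken from [38].

```python
import numpy as np

def ml_face_likelihood(x, mean, eigvecs, eigvals, rho):
    """Eigenspace maximum-likelihood face score (a sketch).

    x: flattened image patch of length N; mean: mean face; eigvecs:
    (N, M) orthonormal eigenfaces; eigvals: (M,) eigenvalues; rho:
    residual variance for the out-of-subspace component."""
    n_dim, m = eigvecs.shape
    centered = x - mean
    y = eigvecs.T @ centered                 # projection into eigenspace
    eps2 = centered @ centered - y @ y       # distance from feature space
    # In-subspace Gaussian term over the M eigenface coefficients.
    in_space = np.exp(-0.5 * np.sum(y**2 / eigvals)) / (
        (2 * np.pi) ** (m / 2) * np.sqrt(np.prod(eigvals)))
    # Out-of-subspace term penalizing the residual eps2.
    out_space = np.exp(-eps2 / (2 * rho)) / (2 * np.pi * rho) ** ((n_dim - m) / 2)
    return in_space * out_space
```

Patches closer to the learned face subspace score higher, which is what makes this likelihood usable as a Face Detection Score in the filtering scheme below.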
The Face Detection Score (FDS) is the likelihood probability, returned by the trained model, of detecting a face in the input image. During our experiments, we observed that the FDS of composite faces is lower than that of real faces in the majority of cases. Based on this observation, we propose a voting system that enables a face recognition system to detect composite faces. The voting system compares the submitted face with a set of randomly selected real faces. If the submitted face's FDS rank among these faces falls below a predefined threshold, it is considered a composite face submitted for the sake of a reconstruction attack.
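The voting test can be sketched as follows. Here `fds` is an assumed black-box Face Detection Score oracle, and the six reference faces and the rank threshold follow the description in the text; this is a sketch of the decision rule, not the deployed system.

```python
import random

def fds_vote(fds, submitted, real_faces, n_ref=6, rank_threshold=5, seed=None):
    """Flag a submitted face as composite via FDS ranking (a sketch).

    Compares the submitted face's FDS against n_ref randomly chosen
    real faces; fds(face) is an assumed scoring oracle."""
    rng = random.Random(seed)
    refs = rng.sample(real_faces, n_ref)
    scores = [fds(f) for f in refs]
    # Rank of the submitted face among the n_ref + 1 candidates;
    # rank n_ref + 1 means it has the highest FDS of the group.
    rank = 1 + sum(fds(submitted) > s for s in scores)
    # True -> treat as composite; the system then returns a misleading
    # 100% confidence score to derail the attacker's search.
    return rank < rank_threshold
```

With six references and a threshold of 5, a submitted face must out-score at least four of the six real faces to be accepted as genuine.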
IV-C Face Detection Score Filtering (FDSF)
When using random search, the attacker needs to change the face composition parts to find the most similar face to the target face. As mentioned earlier, a synthesized face cannot be guaranteed a high FDS for all combinations of parts. To exploit this weakness, we return a 100% confidence score whenever the submitted face is detected as a fake face. This strategy forces the attacker onto a wrong search path, so the face reconstructed by the attacker differs significantly from the target face.
V Experiments
In our experiments, we assume that the attacker has no information about the structure of the target model. To reconstruct a target face, the attacker submits face images generated from composition parts, and the confidence scores returned by the Betaface model [28] are used to search the state space for the generated face most similar to the target face.
V-A Dataset
Betaface [28] enables us to browse synthesized faces across a wide range of faces. Considering the privacy of retrieved faces, we limited our evaluations to the Celebrities dataset, which contains more than 40k faces of famous people. To evaluate the proposed countermeasure, we also used the WLFDB [33] and PubFig [34] datasets.
V-B Vulnerability of Face Recognition Systems
In this section, we show that online face recognition systems are extremely vulnerable to composite faces: even low-quality composite faces can receive a high confidence score from the target model. To do so, we use an online face classifier called Betaface [28], which provides verification (face comparison) and identification (face search) services. Its API enables developers to extract general face information, including positions, sizes, angles, and the locations of 123 face landmarks. In our experiments, we checked to what extent a face synthesized from composition parts can cheat a face recognition system. Figure 4 shows that the target model correctly finds the face landmarks in complete synthesized faces. Partial faces are even more challenging, since they contain only eyes and a nose; even in this case, the target model found the eye and nose landmarks correctly. This experiment reveals that face recognition systems are defenseless against composite faces.
Figure 5 shows three composite faces and their corresponding retrieved faces. This experiment demonstrates that even low-quality composite faces are recognized by online models [28] with high confidence scores.
V-C Attack Evaluation
To evaluate the quality of a reconstructed face, we classify the reconstructed face and the target face by age, gender, ethnicity, smile, and a set of high-level face characteristics, again using the Betaface classifier [28] described above. Figure 10 shows three different test cases as target faces and their corresponding reconstructed faces.
V-C1 Test Case (a)
To compare the similarity of target face (a) and its reconstructed version, 17 different high-level characteristics are used, as shown in Tables I and II. The experimental results show that 12 characteristics are categorized identically in the target and reconstructed images for test case (a). The most unexpected result concerns Beard and Mustache: in both cases, Betaface detected these items in the reconstructed image but not in the target face. Figure 6 shows the probabilities of the classification results for test case (a).
V-C2 Test Case (b)
To compare the similarity of target face (b) and its reconstructed version, 17 different high-level characteristics are used, as shown in Tables III and IV. The experimental results show that 9 characteristics are categorized identically in the target and reconstructed images for test case (b). The proposed method fails to reconstruct the expression of the target face, but the detected age is rather close to that of the target face. Figure 7 shows the probabilities of the classification results for test case (b).
Table I (test case (a)):

|                    | Age | Arched Eyebrows | Bald | Beard | Big Lips | Big Nose | Double Chin | Gender | Narrow Eyes |
|--------------------|-----|-----------------|------|-------|----------|----------|-------------|--------|-------------|
| Reconstructed Face | 28  | No              | Yes  | Yes   | Yes      | Yes      | No          | Male   | No          |
| Target Face        | 44  | No              | Yes  | No    | No       | Yes      | No          | Male   | No          |

Table II (test case (a)):

|                    | Attractive | Bags Under Eyes | Expression | Mustache | Pointy Nose | Bushy Eyebrows | Bangs | Glasses |
|--------------------|------------|-----------------|------------|----------|-------------|----------------|-------|---------|
| Reconstructed Face | No         | No              | Neutral    | Yes      | No          | Yes            | No    | No      |
| Target Face        | No         | No              | Neutral    | No       | No          | No             | No    | No      |
Table III (test case (b)):

|                    | Age | Arched Eyebrows | Bald | Beard | Big Lips | Big Nose | Double Chin | Gender | Narrow Eyes |
|--------------------|-----|-----------------|------|-------|----------|----------|-------------|--------|-------------|
| Reconstructed Face | 24  | No              | No   | No    | Yes      | Yes      | No          | Male   | No          |
| Target Face        | 33  | No              | No   | No    | No       | No       | No          | Male   | Yes         |

Table IV (test case (b)):

|                    | Attractive | Bags Under Eyes | Expression | Mustache | Pointy Nose | Bushy Eyebrows | Bangs | Glasses |
|--------------------|------------|-----------------|------------|----------|-------------|----------------|-------|---------|
| Reconstructed Face | Yes        | No              | Neutral    | No       | No          | Yes            | Yes   | No      |
| Target Face        | No         | No              | Smile      | No       | No          | No             | No    | No      |
Table V (test case (c)):

|                    | Age | Arched Eyebrows | Bald | Beard | Big Lips | Big Nose | Double Chin | Gender | Narrow Eyes |
|--------------------|-----|-----------------|------|-------|----------|----------|-------------|--------|-------------|
| Reconstructed Face | 34  | No              | Yes  | No    | No       | Yes      | No          | Male   | No          |
| Target Face        | 45  | No              | Yes  | No    | No       | No       | No          | Male   | No          |

Table VI (test case (c)):

|                    | Attractive | Bags Under Eyes | Expression | Mustache | Pointy Nose | Bushy Eyebrows | Bangs | Glasses |
|--------------------|------------|-----------------|------------|----------|-------------|----------------|-------|---------|
| Reconstructed Face | No         | No              | Neutral    | No       | No          | No             | No    | Yes     |
| Target Face        | No         | No              | Neutral    | No       | No          | No             | No    | Yes     |
V-C3 Test Case (c)
To compare the similarity of target face (c) and its reconstructed version, 17 different high-level characteristics are used, as shown in Tables V and VI. The experimental results show that 15 characteristics are categorized identically in the target and reconstructed images for test case (c). Figure 8 shows the probabilities of the classification results for test case (c). We summarize the experimental results as follows:

The best returned confidence score at each generation is significantly higher than in the previous generation, as shown in part (b) of Figure 7. This shows that a rounding policy as a countermeasure cannot stop the evolution-based reconstruction attack.

The reconstructed faces are similar to the target faces in terms of the majority of face characteristics.

The proposed method fails to reconstruct a face with exactly the same age as the target face. Researchers might exploit this fact to devise new countermeasures against evolution-based face reconstruction attacks.
V-D Evaluation of the Countermeasure
To evaluate the proposed countermeasure, we assume that the attacker uses a search-based method such as random search to evade the rounding policy. To simulate the attack, we used the face comparison tool of Betaface [28], which returns the confidence score of face similarities. To test FDSF's ability to detect fake faces, we compare each composite face against six randomly chosen real faces, selected from the WLFDB [33] and PubFig [34] datasets. Our experiments show that the composite faces always obtain a low rank in terms of FDS, as shown in Figure 10, where each label represents the FDS ranking of the corresponding face. In all experiments, we set the threshold to 5, which means that submitted faces are detected as composite if their FDS rank is less than 5; consequently, a high confidence score is returned to the user. Our experimental results show that FDSF is very successful at fooling the attacker by returning high confidence scores for detected fake faces, as shown in Figure 9. Figure 11 shows three different cases, each including the target face, the face reconstructed without the countermeasure, and the face reconstructed with the countermeasure. To measure the impact of the proposed countermeasure, we calculated the similarity of the reconstructed faces, with and without the countermeasure, to the target face, as shown in Figure 12. The results show that the proposed countermeasure significantly decreases the similarity between the reconstructed face and the target face.
VI Conclusion
In this paper, we showed that a rounding policy as a countermeasure cannot guarantee the privacy of face recognition models' training data. Composite faces have been used for decades for good reasons; however, they can easily be exploited to violate the privacy of individuals whose face images are used as training data. We demonstrated the vulnerability of face recognition systems in preserving the privacy of training data, and we proposed a new countermeasure against the proposed face reconstruction attack based on the Face Detection Score (FDS). Our experiments showed that composite faces fail to outperform real faces in terms of FDS in the majority of cases, and that the proposed countermeasure can detect composite faces using a voting system that compares the FDS of submitted faces against a set of random real faces.
References
 [1] Ribeiro, Mauro, Katarina Grolinger, and Miriam AM Capretz. ”MLaaS: Machine learning as a service.” 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). IEEE, 2015.
 [2] Kesarwani, Manish, et al. ”Model extraction warning in mlaas paradigm.” Proceedings of the 34th Annual Computer Security Applications Conference. ACM, 2018.

 [3] Turk, Matthew A., and Alex P. Pentland. ”Face recognition using eigenfaces.” Proceedings of the 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 1991.
 [4] Turk, Matthew, and Alex P. Pentland. ”Face recognition system.” U.S. Patent No. 5,164,992. 17 Nov. 1992.
 [5] Fredrikson, Matt, Somesh Jha, and Thomas Ristenpart. ”Model inversion attacks that exploit confidence information and basic countermeasures.” Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 2015.
 [6] Lee, Sukhan, and Rhee Man Kil. ”Inverse mapping of continuous functions using local and global information.” IEEE Transactions on Neural Networks 5.3 (1994): 409-423.
 [7] Linden and Kindermann, “Inversion of multilayer nets,” in International 1989 Joint Conference on Neural Networks, 1989, pp. 425–430 vol.2.
 [8] Lu, Bao-Liang, Hajime Kita, and Yoshikazu Nishikawa. ”Inverting feedforward neural networks using linear and nonlinear programming.” IEEE Transactions on Neural Networks 10.6 (1999): 1271-1290.
 [9] Mahendran, Aravindh, and Andrea Vedaldi. ”Understanding deep image representations by inverting them.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
 [10] Dalal, Navneet, and Bill Triggs. ”Histograms of oriented gradients for human detection.” International Conference on Computer Vision and Pattern Recognition (CVPR’05). Vol. 1. IEEE Computer Society, 2005.
 [11] Lowe, David G. ”Distinctive image features from scale-invariant keypoints.” International Journal of Computer Vision 60.2 (2004): 91-110.
 [12] Fredrikson, Matthew, et al. ”Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing.” 23rd USENIX Security Symposium (USENIX Security 14). 2014.

 [13] Mahendran, Aravindh, and Andrea Vedaldi. ”Visualizing deep convolutional neural networks using natural pre-images.” International Journal of Computer Vision 120.3 (2016): 233-255.
 [14] Dosovitskiy, Alexey, and Thomas Brox. ”Inverting visual representations with convolutional networks.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
 [15] Dosovitskiy, Alexey, and Thomas Brox. ”Generating images with perceptual similarity metrics based on deep networks.” Advances in neural information processing systems. 2016.
 [16] Nash, Charlie, Nate Kushman, and Christopher K.I. Williams. ”Inverting supervised representations with autoregressive neural density models.” arXiv preprint arXiv:1806.00400 (2018).
 [17] Yang, Ziqi, Ee-Chien Chang, and Zhenkai Liang. ”Adversarial Neural Network Inversion via Auxiliary Knowledge Alignment.” arXiv preprint arXiv:1902.08552 (2019).
 [18] Homer, N., S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V. Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig. ”Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays.” PLoS Genetics 4.8 (2008): e1000167.
 [19] Backes, Michael, et al. ”Membership privacy in MicroRNA-based studies.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016.
 [20] Homer, Nils, et al. ”Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays.” PLoS Genetics 4.8 (2008): e1000167.
 [21] Pyrgelis, Apostolos, Carmela Troncoso, and Emiliano De Cristofaro. ”Knock knock, who’s there? Membership inference on aggregate location data.” arXiv preprint arXiv:1708.06145 (2017).
 [22] Shokri, Reza, et al. ”Membership inference attacks against machine learning models.” 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017.
 [23] Hayes, Jamie, et al. ”LOGAN: Membership inference attacks against generative models.” Proceedings on Privacy Enhancing Technologies 2019.1 (2019): 133-152.
 [24] Truex, Stacey, et al. ”Towards demystifying membership inference attacks.” arXiv preprint arXiv:1807.09173 (2018).
 [25] Melis, Luca, et al. ”Inference attacks against collaborative learning.” arXiv preprint arXiv:1805.04049 (2018).
 [26] http://portraitpad.com/
 [27] Schroff, Florian, Dmitry Kalenichenko, and James Philbin. ”Facenet: A unified embedding for face recognition and clustering.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
 [28] https://www.betafaceapi.com/demo.html
 [29] Klum, Scott, et al. ”Sketch based face recognition: Forensic vs. composite sketches.” 2013 international conference on biometrics (ICB). IEEE, 2013.

 [30] Huang, Gary B., et al. ”Labeled faces in the wild: A database for studying face recognition in unconstrained environments.” Workshop on Faces in Real-Life Images: Detection, Alignment, and Recognition. 2008.
 [31] Solis, Francisco J., and Roger J.-B. Wets. ”Minimization by random search techniques.” Mathematics of Operations Research 6.1 (1981): 19-30.

 [32] Zhai, W., P. Kelly, and W.B. Gong. ”Genetic algorithms with noisy fitness.” Mathematical and Computer Modelling 23.11-12 (1996): 131-142.
 [33] Wang, Dayong, S. Hoi, and Jianke Zhu. ”WLFDB: Weakly labeled face databases.” Technical Report. 2014.
 [34] Wang, Dayong, and A. K. Jain. ”Face Retriever: Pre-filtering the Gallery via Deep Neural Net.” ICB, Phuket, Thailand, May 19-22, 2015.
 [35] Anderson, Richard Loree. ”Recent advances in finding best operating conditions.” Journal of the American Statistical Association 48.264 (1953): 789-798.
 [36] Rastrigin, L. A. ”The convergence of the random search method in the extremal control of a many parameter system.” Automation and Remote Control 24 (1963): 1337-1342.
 [37] Karnopp, Dean C. ”Random search techniques for optimization problems.” Automatica 1.2-3 (1963): 111-121.

 [38] Günther, Manuel, and Rolf P. Würtz. ”Face detection and recognition using maximum likelihood classifiers on Gabor graphs.” International Journal of Pattern Recognition and Artificial Intelligence 23.03 (2009): 433-461.