Machine learning (ML) has achieved great success because of its high accuracy and ease of implementation. ML outperforms both human beings and traditional programs on many problems, such as face recognition and board games, which convinces people to integrate ML models into their systems with confidence. Apart from high accuracy, ML is easy to implement: all developers need to do is collect and label data, and the ML model then learns to handle future inputs through training.
When developing a system with ML, developers usually outsource all model-related work to an independent, professional ML team and use ML models as they would a third-party library. This outsourcing style is chosen mainly because ordinary developers lack a data science background and are hesitant to learn how to handle ML models.
ML outsourcing accelerates product development and requires only a little ML background from developers. Usually, they only need to comprehend the format and meaning of the model input and output. In this way, both teams can focus on their own expertise without becoming too tightly coupled. This style has become a prevalent way of integrating ML into systems.
However, ML outsourcing introduces potential threats because developers misuse models they are unfamiliar with. Unlike a third-party library, ML models and their output values are foreign to developers. For a third-party library, the return value and functionality of each interface are easily understood by developers and are usually documented in detail. By contrast, ML models, though they sometimes come with manuals and documentation, are not fully understood by developers in terms of their security implications. In fact, even ML model designers themselves do not fully understand the security implications of their work.
Our work takes face embeddings as an example to understand the risk behind such a development process. Embeddings are widely used in ML to handle a broad range of problems. An embedding maps complex data into a unified space (usually a fixed-dimension vector) while preserving specific properties, so that the data can be processed more easily in the simpler space. Many complex problems first employ embeddings to allow later ML or non-ML operations on the unified space. For example, face search maps a portrait photo to a face embedding, which can then be compared with other vectors in a database using a simple, well-known distance such as L2 distance or cosine distance. We focus on face embeddings for three reasons. 1) Embeddings are often the interface between ML and non-ML components, and security issues often stem from the transition between entities. 2) The embedding is the most ML-specific concept that developers must know before they can operate the model. Complex ML details are hidden from developers, but they must understand embeddings before they can use them to achieve the required functionality. However, according to our study, the embedding is still too ML-specific, and developers may not fully comprehend its implications. 3) Embeddings are widely used in areas such as CV and NLP.
To understand the risk, we first studied how developers may leak embeddings. Developers are assumed to operate on embeddings provided by ML models, for example by calculating distances or by storing or transmitting the embeddings. We investigate whether attackers can infer or directly obtain embeddings during these operations. We then study the consequences of leaked embeddings. An embedding implies a mapping from a complex space to a simple space for easier operation. We consider whether attackers can reverse the mapping, i.e., obtain a mapping from the simple space back to the original complex space.
Our first finding is that developers are prone to leaking embeddings accidentally in many ways, mainly because they regard embeddings as insensitive and do not comprehend their ML implications. The most threatening channel we found is leakage through distance calculation, i.e., developers calculate and expose the distance from a user-supplied input to a sensitive embedding. We discussed this with several developers who had experience using embeddings, and none of them considered it insecure or incorrect. However, according to our analysis, an embedding can be directly inferred if attackers know enough distances to it. In addition, embeddings may leak through improper storage or network transmission.
The second finding is that embeddings are as sensitive as the original input, because with the model we designed, the original input that generated an embedding can be recovered from the embedding alone. To achieve this, we designed a GAN-like model that learns the mapping between the complex input space and the embedding space. The model then reverses the mapping and generates a valid, corresponding input for a given embedding. The evaluation results show that the recovered input images come with nearly no quality loss.
The two findings together imply that embedding leakage carries high risk. The recovery model bridges the embedding and the original input, while the first finding links the embedding with non-ML leakage. Combined, the two findings imply that attackers can recover the original input once they acquire leaked values that result from developers’ carelessness, such as distances. The original input is highly sensitive and must be well protected against any leakage.
To avoid further embedding leakage, we suggest that ML library developers clearly state, in the documentation for obtaining embeddings, that an embedding is as sensitive as the original input image and that any result calculated from embeddings should not be freely exposed to users or third parties. In addition, we suggest that ML researchers design embedding models that cannot be easily reversed.
We summarize our contributions as follows:
We identified that developers may accidentally leak embeddings through various channels because, from the developers’ perspective, embeddings are not that sensitive.
We discovered that leaking distances to an embedding is equivalent to leaking the embedding itself.
We discovered that, for face recognition models, leaking embeddings is nearly equivalent to leaking the original input.
We designed a GAN-based recovery model that maps face embeddings to portrait photos. With this model, we showed that embeddings from four well-known face embedding models can all be recovered.
2. Adversary Model
We consider two adversary models in this work: black box and white box.
2.1. Goal of Adversaries
Adversaries may target specific victims or prey on random ones. Adversaries are assumed to be interested in their victims’ privacy, especially their appearance. Specifically, adversaries want the profile photos of their victims.
2.2. Common Assumptions
Embedding. We assume that adversaries can always get the embedding of any image they provide.
This assumption usually holds because we assume ML models are offered as a public service that anyone can freely use. For example, if an app contains a face embedding model, an attacker can reverse engineer the app to extract the model and use it to generate embeddings for any images she wants. Alternatively, if an app uses an online ML model, an attacker can first reverse engineer the protocol and then submit an image to the model to get the embedding.
Acquiring Leakages. We assume adversaries can directly or indirectly acquire some leaked embeddings of their target. Developers leak embeddings in many ways, which will be introduced in later sections. These leaking channels can be categorized into direct and indirect channels.
Developers may leak embeddings without being aware of it. Because they lack an ML background, developers may inadvertently expose intermediate calculation results related to embeddings, which may be highly sensitive, as we will show later. In this case, attackers can reconstruct embeddings from these leaked values.
Adversaries may also acquire embeddings directly because developers transmit, store or manipulate embeddings in an insecure manner, which gives attackers opportunities to intercept them. For example, we will show later that developers may serve ML-related functionality over plain HTTP. In this case, attackers can readily use off-the-shelf sniffing tools to intercept embeddings from the network.
Image Database. Adversaries are assumed to own a large database of human faces, which is easy to obtain today. They can directly download open CV datasets; many such datasets are available, including LFW and CASIA-WebFace.
2.3. Black Box Adversaries
Adversaries need not know anything about the embedding model, which corresponds to the black box scenario. Adversaries can acquire embeddings of their victims but have no idea what ML model generated them. They do not know what structure the model uses or how many neurons it has, let alone its parameters.
This could happen if an adversary targets an online model or if she is unable to reverse engineer the target service or protocol. Today, many ML enterprises provide online ML services to other enterprises or users. In this case, there is no public documentation of the model that could guide adversaries in recovering a victim’s appearance.
2.4. White Box Adversaries
Sometimes, an adversary may know the target model in detail, including its structure and even its parameters. In this scenario, she is expected to recover images of better quality.
In some cases, ML models are embedded into products such as apps, in which case adversaries can extract the model and analyze it.
3. Embedding Leakage
In this section, we show how embeddings can be leaked by careless developers. The first category, distance-based leakage, stems from developers’ unfamiliarity with ML and mathematics, while the latter categories result from unawareness of the sensitivity of embeddings.
3.1. Leakage through Distance
Embeddings are geometric points. As mentioned before, embeddings let developers compare a pair of complex data items more easily by directly calculating the distance (norm of the difference) between their embeddings. The calculated distance is expected to reflect the similarity of the original pair. All of this is possible because embeddings are regarded as geometric points.
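As a minimal illustration of the comparison described above, the sketch below computes both distances in plain Python. The toy four-dimensional vectors, the threshold value and the definition of cosine distance as one minus cosine similarity are assumptions made for the example, not details of any particular face embedding model.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """Cosine distance, defined here (an assumption) as 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings"; real face embeddings are much longer.
probe = [0.1, 0.9, 0.3, 0.2]
enrolled = [0.1, 0.8, 0.4, 0.2]
THRESHOLD = 0.63  # hypothetical same-person decision threshold

# The decision itself is harmless to expose; the raw distance is what leaks.
same_person = cosine_distance(probe, enrolled) < THRESHOLD
```

A system that returns only `same_person` reveals far less than one that returns the raw distance, which is the quantity the rest of this section is concerned with.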
Distances are leaked. However, in many cases we observed, developers do not regard the calculated distance as sensitive. As a result, they sometimes expose it in unnecessary places.
An interesting example of distance leakage we encountered is a self-service machine deployed by a government. The machine authenticates users by their faces before they can access critical functions. To use the machine, a user first enters his ID number and then looks at the camera, which takes a photo of him. The photo is compared with the profile photo stored in the government’s database. However, the machine directly displays the similarity between the current user and the profile photo belonging to the claimed user.
Embeddings can be recovered from distances. As we identified, once an attacker acquires enough distances to a sensitive embedding, the embedding can be recovered.
Let us consider the simplest case. Assume that there is an unknown point P (an embedding with two dimensions) in a 2D plane. If two other points (A and B) and their distances (r_A and r_B) to P are known, the coordinates of P can be worked out, because P must be located at the intersection of two circles: one centered at A with radius r_A, the other centered at B with radius r_B. Two circles can intersect in two points, in one point, or not at all. Because we already know that there is at least one point in the intersection, the intersection contains either one or two points, and P is one of them. As a result, once A, B, r_A and r_B are known, P is known up to at most two candidates.
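The two-circle argument can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `circle_intersections` and the sample coordinates are hypothetical, and the secret point is used only to simulate the two leaked distances.

```python
import math

def circle_intersections(a, r_a, b, r_b):
    """Return the points lying on both circles (center a, radius r_a) and (center b, radius r_b)."""
    ax, ay = a
    bx, by = b
    d = math.hypot(bx - ax, by - ay)            # distance between the two centers
    if d == 0 or d > r_a + r_b or d < abs(r_a - r_b):
        return []                                # concentric, too far apart, or nested
    # Signed distance from a to the foot of the common chord, along a -> b.
    t = (r_a ** 2 - r_b ** 2 + d ** 2) / (2 * d)
    h = math.sqrt(max(r_a ** 2 - t ** 2, 0.0))   # half-length of the chord
    fx = ax + t * (bx - ax) / d
    fy = ay + t * (by - ay) / d
    ox = -h * (by - ay) / d                      # offset perpendicular to a -> b
    oy = h * (bx - ax) / d
    if h == 0:
        return [(fx, fy)]                        # circles touch in a single point
    return [(fx + ox, fy + oy), (fx - ox, fy - oy)]

# The secret point P is only used to produce the two "leaked" distances.
P = (3.0, 4.0)
A, B = (0.0, 0.0), (6.0, 0.0)
r_A = math.dist(P, A)
r_B = math.dist(P, B)
candidates = circle_intersections(A, r_A, B, r_B)
```

With these inputs the attacker is left with two mirror-image candidates, one of which is P; a third known point and distance would single it out.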
Embeddings are also points, so the same reasoning applies: if enough distances to an embedding are known, the embedding is known. The only difference between embeddings and 2D points is that embeddings are of higher dimension. Given that developers use an n-dimensional embedding, once the distances from it to n other embeddings are known, the embedding is leaked.
Solving equations to recover embeddings. The attacker can solve equations to get the embedding. Every time an attacker learns a distance to the embedding, she knows that the embedding lies on an (n-1)-sphere. Once she has accumulated several different distances, she knows that the embedding lies on the intersection of these spheres. Algebraically, she gets one equation about the embedding for each (n-1)-sphere, and the embedding is a root of the resulting equation system (equation system 1), where x is the unknown embedding, and e_i and d_i are the i-th known embedding and its distance to x.
Embedding systems usually use either cosine distance or Euclidean (L2) distance. For embedding systems that are evaluated with L2 distance, the equation system attackers need to solve becomes equation 2. If equation 2 has only a few roots (one or two), as in the 2D (circle) case, embeddings can be recovered by solving it. If it has many roots, the embedding still cannot be recovered, because attackers do not know which of the roots is the embedding.
Therefore, the number of roots of equation 2 decides whether attackers can successfully recover embeddings. Unfortunately, as we will show, equation 2 has at most two different roots, indicating that attackers can readily recover embeddings.
To know how many possible embeddings attackers are left with given n distances, we first check how many roots equation 3 has. Because it is an ordinary linear equation system, it has only one root (equation 4), given that all known embeddings e_i are linearly independent and c is a newly introduced constant that is independent of the unknown embedding x. For the root of equation 3 to also be a root of equation 2, it need only additionally satisfy equation 5, since equation 3 becomes equation 2 once c is replaced by ||x||^2. However, such a c has at most two roots, as equation 5 is an ordinary quadratic equation in c, which is a scalar. As a result, x also has at most two roots, as shown by equation 6.
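The algebra described in this paragraph can be written out as follows. This is a reconstruction sketch in notation chosen here: x is the unknown embedding, e_i and d_i are the known embeddings and their distances to x, E is the matrix whose rows are the e_i, b is the vector with entries ||e_i||^2 - d_i^2, and c is the auxiliary constant. The equation numbers in the comments only mirror the numbers the text refers to and are an assumption.

```latex
% L2 system (what the text calls equation 2)
\|x - e_i\|_2^2 = d_i^2, \qquad i = 1, \dots, n

% Expanding the square and writing c for \|x\|_2^2 gives a linear system (equation 3)
c - 2\, e_i^\top x + \|e_i\|_2^2 = d_i^2
\;\;\Longleftrightarrow\;\;
E x = \tfrac{1}{2}\,(c\,\mathbf{1} + b), \qquad b_i = \|e_i\|_2^2 - d_i^2

% Its unique root for linearly independent e_i (equation 4)
x(c) = \tfrac{1}{2}\,(c\,u + v), \qquad u = E^{-1}\mathbf{1}, \quad v = E^{-1} b

% Consistency condition (equation 5): a quadratic in the scalar c
\|x(c)\|_2^2 = c
\;\;\Longleftrightarrow\;\;
(u^\top u)\, c^2 + (2\, u^\top v - 4)\, c + v^\top v = 0

% Each real root c_\pm yields one candidate embedding (equation 6)
x_\pm = \tfrac{1}{2}\,(c_\pm\, u + v)
```

Since u is non-zero whenever E is invertible, the leading coefficient of the quadratic does not vanish, which is why at most two candidates remain.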
For embedding systems that use cosine distance as the metric, things are similar. Equation 8 tells an attacker how to calculate the embedding once she has n different distances, because the embedding must satisfy equation 7. In this case, the norm of the embedding is lost, but the norm is unimportant in cosine embedding systems because two parallel embeddings are regarded as exactly the same embedding.
Attackers can also solve equation 2 and equation 7 numerically without any linear algebra background. Scientific computing tools such as Matlab and Wolfram Alpha provide straightforward interfaces that let attackers enter the equations and obtain numerical roots quickly.
In short, once an attacker has accumulated enough distances to an embedding, she can always recover the embedding, whether the system uses Euclidean or cosine distance.
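The recovery argument for both metrics can be sketched in plain Python. The function names, the three-dimensional toy data and the small Gaussian-elimination helper are hypothetical choices for illustration. The sketch assumes the leaked values are exact, the known embeddings are linearly independent, and the cosine system exposes cosine similarity (a cosine distance defined as one minus similarity would be converted first).

```python
import math

def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def recover_from_l2(known, dists):
    """Return the (at most two) points whose L2 distances to `known` equal `dists`."""
    n = len(known)
    b = [dot(e, e) - d * d for e, d in zip(known, dists)]
    u = solve_linear(known, [1.0] * n)    # E u = 1
    v = solve_linear(known, b)            # E v = b, so x = (c*u + v) / 2
    # Substituting x into ||x||^2 = c gives a quadratic in the scalar c.
    qa, qb, qc = dot(u, u), 2 * dot(u, v) - 4, dot(v, v)
    disc = max(qb * qb - 4 * qa * qc, 0.0)    # clamp tiny negative rounding error
    roots = {(-qb + s * math.sqrt(disc)) / (2 * qa) for s in (1, -1)}
    return [[(c * ui + vi) / 2 for ui, vi in zip(u, v)] for c in roots]

def recover_from_cosine(known, sims):
    """Return the unit vector whose cosine similarity to each row of `known` is `sims`."""
    unit_rows = [[p / math.sqrt(dot(e, e)) for p in e] for e in known]
    x = solve_linear(unit_rows, sims)     # unit_rows @ x = sims for a unit-norm x
    norm = math.sqrt(dot(x, x))
    return [p / norm for p in x]

# Demo: the secret embedding is used only to simulate the leaked values.
secret = [0.2, -0.5, 0.7]
known = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.3, 0.4, 1.0]]
dists = [math.dist(secret, e) for e in known]
candidates = recover_from_l2(known, dists)

norm_secret = math.sqrt(dot(secret, secret))
sims = [dot(secret, e) / (norm_secret * math.sqrt(dot(e, e))) for e in known]
direction = recover_from_cosine(known, sims)
unit_secret = [p / norm_secret for p in secret]
```

In the L2 case one of the returned candidates coincides with the secret embedding; in the cosine case only the direction is recovered, which matches the remark that the norm is lost but irrelevant.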
Recovering anyone’s profile photo embedding. In the self-service machine case, an attacker can print, say, 1000 photos of 1000 different people. Whenever she wants to recover somebody’s profile photo from the government’s database, she only needs to go to the self-service machine, enter the victim’s ID number, present the 1000 photos one by one, and record the displayed similarities. Finally, she solves the equations to recover the embedding. The embedding can then be sent to the model we introduce later to recover the profile photo.
Note that this method can eventually help attackers dump the nation’s whole profile photo database. Theoretically, the attacker can enumerate all ID numbers and repeat the attack. What makes this especially threatening is that the attacker needs to know nothing about the victim.
3.2. Leakage through storage and network
To be contributed…
4. Embedding Recovery Model
In this section, we introduce the recovery framework we designed, as well as the training process, for recovering the original input images from embeddings acquired by attackers.
Inspired by GAN and VAE (Variational Autoencoder), we designed the framework shown in Figure 1.
The framework mainly consists of a new embedding-to-image generator, a discriminator and several loss functions.
We assume that attackers own a set of images to train a recovery model. The images are first fed to the target face embedding model, either by querying an online model or by passing them through the model directly if it is accessible to the attackers. After that, the attackers have the embeddings corresponding to their images.
The embeddings, instead of randomly generated noise, are sent to the generator, which is a main difference from the ordinary GAN framework and a key innovation of our work. The generator generates an image for each input embedding. The generated images (IMG_g) are then sent to different modules for loss generation.
The generated images are used to compute three kinds of loss: discriminator loss, recovery loss and embedding loss. The three loss terms are then used to direct the update of the generator.
The generator does not follow the classical GAN generator design. We also tried to fit classical structures to the problem setting but found none that performed well.
The generator takes embeddings as input instead of noise because, in our case, we expect the model to generalize over embeddings. Ordinary GANs generalize over the noise field but do not constrain the output, resulting in meaningful but irrelevant images. In our setting, however, we need the generated images to correspond exactly to their input embeddings. Conditional GANs use noise to generalize over the newly added noise field under the constraint of the label. In our problem setting, the generator will never be used to generate different images for the same embedding, so we do not need this kind of generalization. In contrast to the GAN family, our generator generalizes over embeddings and outputs the corresponding face images. Figure 2 illustrates the difference between our generator and the generators of the other two GANs mentioned.
The generator we designed has two phases: a multi-path phase and a single-path phase. Figure 3 shows the design of our generator for 512-dimension embedding input.
The first phase, the multi-path phase, extracts information from the input embedding at different paces, aiming to adapt to embeddings generated by models of different types. For our 512-dimension embedding recovery case, the rapid branch directly deconvolves the embedding from 512 dimensions into 512 tiny 10*10 images. In contrast, the mild branch first deconvolves it into 2*2 and then 10*10. The different paces result in information extraction at different granularities. The branches are combined once they reach the same size, providing a unified input for the later phase.
The second phase, the single-path phase, generates gradually clearer and larger images by concentrating vectors. The data repeatedly passes through a deconvolution unit followed by a residual convolution unit (Figure 4). The deconvolution unit enlarges the generated images by fusing multiple channels, doubling the image size while halving the number of channels; the residual unit then rectifies the images without changing their size.
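To make the size bookkeeping of the single-path phase concrete, the sketch below traces tensor shapes through repeated stages. The starting channel count after the branches are combined, the number of stages and the resulting output size are hypothetical; the text only fixes the per-stage rule (size doubled, channels halved, residual unit shape-preserving).

```python
def single_path_shapes(channels, size, stages):
    """List (channels, height, width) after each deconvolution + residual stage."""
    shapes = [(channels, size, size)]
    for _ in range(stages):
        channels //= 2   # the deconvolution unit halves the channel count
        size *= 2        # and doubles the spatial size
        # the residual unit refines the image and leaves the shape unchanged
        shapes.append((channels, size, size))
    return shapes

# Hypothetical configuration: 512 channels of 10*10 maps, four stages.
shapes = single_path_shapes(channels=512, size=10, stages=4)
for shape in shapes:
    print(shape)
```

Under these assumed settings the trace ends at 32 channels of 160*160 maps; a final layer mapping to image channels would still be needed and is not modelled here.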
4.3. Recovery Loss
To encourage the generator to generate images corresponding to their embeddings, we add a loss term that forces the generated image (IMG_g) to be similar to the original image (IMG). The loss penalizes the difference between IMG and IMG_g. Equation 9 shows the recovery loss.
Here we use the L1 norm to measure the loss instead of the L2 norm because we found that L2 pays more attention to penalizing the background than L1 does. L2 cares more about larger differences, and the background usually varies more from one image to another than the face does.
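The effect described above can be seen on a toy example. The sketch below is an illustration under stated assumptions: the eight-pixel flattened "images", the split into face and background pixels and the helper name `background_share` are all hypothetical. It compares how much of the total error each norm attributes to the background when a single background pixel is far off.

```python
def background_share(per_pixel_errors, background_start):
    """Fraction of the total error that comes from the background pixels."""
    total = sum(per_pixel_errors)
    return sum(per_pixel_errors[background_start:]) / total

# Flattened toy images: the first 4 pixels are "face", the last 4 "background".
original = [0.5, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1, 0.1]
generated = [0.4, 0.4, 0.4, 0.4, 0.9, 0.1, 0.1, 0.1]  # one background pixel far off

abs_err = [abs(p - q) for p, q in zip(original, generated)]  # L1 contributions
sq_err = [e ** 2 for e in abs_err]                           # L2 (squared) contributions

share_l1 = background_share(abs_err, background_start=4)
share_l2 = background_share(sq_err, background_start=4)
print(f"background share of L1 loss: {share_l1:.3f}")
print(f"background share of L2 loss: {share_l2:.3f}")
```

The squared error concentrates almost all of the penalty on the one large background deviation, while the absolute error leaves a larger share of the gradient signal to the small but consistent face errors.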
4.4. Embedding Loss
To help the recovery loss (L_r) better stimulate the generator to recover images, we also employ the face embedding model. We feed the generated images (IMG_g) to the face embedding model to get their embeddings (Emb_g). The embedding loss simply measures the difference between the input embeddings and the embeddings of the generated images, as shown by equation 10.
The type of norm used by the embedding loss depends on the norm the face embedding model uses, as that norm best describes the difference between embeddings.
Note that the embedding loss is optional for attackers because in some cases the attacker cannot obtain the face embedding model for training. Under the white box adversary model, attackers can readily include the embedding model in training. A black box attacker, however, can only query an online model. She cannot include the embedding model to construct the embedding loss, simply because an online model does not provide gradients, which are compulsory for updating the parameters of the generator. Specifically, given IMG_g and Emb_g, attackers have no way to know the gradient of Emb_g with respect to IMG_g. In this case, the embedding loss is directly set to the constant 0.
To let black box adversaries also benefit from the embedding loss, an attacker can either 1) train a substitute model that approximates the embedding model, or 2) use an open source model instead, to construct the embedding loss.
In the first case, attackers can employ the teacher-student method (Hinton et al., 2015) to construct a model from the embeddings and images such that the model behaves approximately the same as the target model. That is, with images IMG_1 to IMG_k and their embeddings Emb_1 to Emb_k, the attacker can train a model M' such that M'(IMG_1) to M'(IMG_k) are approximately equal to Emb_1 to Emb_k. The attacker can then use M' as a substitute for the target model M in equation 10 to obtain gradients and direct the updating, expecting that the gradient of M' also approximates that of M.
In the second case, as long as a model M'' is also a face embedding model, it can be used to measure the quality of the generated images, even though it may have utterly different characteristics from the original model M. The attacker can expect an open source model to help evaluate the similarity between the generated images and the training images by simply adding to the loss the norm of the difference between their embeddings. Thus, attackers expect the loss computed with M'' to be positively related to the loss computed with M.
The discriminator guarantees that the generated images indeed look like images containing faces.
The discriminator we employ follows the standard discriminator from WGAN-GP, and its loss is directly used as our discriminator loss. WGAN-GP needs to maintain a Lipschitz function to calculate the Wasserstein distance. To do so, it penalizes the gradient for every individual sample. Specifically, the discriminator we use drops all batch normalization (BN) layers, and after every convolutional operation we add a small residual block, as in our generator. Finally, the output of the model is a scalar that represents the discriminator’s confidence that the input is real.
4.6. Training Process
We follow the GAN training process to train our model, i.e., we train the generator and the discriminator in turn. When training the generator, a ratio between 3:1:1 and 2:1:1 was found to work best for the weights of the three terms of the generator loss. The learning rate is decayed by 0.02 every epoch. We train the generator five times after every single discriminator training step.
In this section, we evaluate the recovery model we designed. To understand how much risk embedding leakage poses, we show the images recovered by the recovery model from real-world embeddings.
5.1. Target Embedding Model
We chose four face embedding models as our targets: a self-trained Wide Inception ResNet, the Clarifai online face embedding model, the current version of Facenet and an older version of Facenet.
|Dimension|Distance|Threshold|Accuracy|
|512|Cosine|0.63|97.6% (a)|
|128|L2 (b)|1.28|97.1% (a)|
(a) We used dlib as the alignment tool, which resulted in lower accuracy than that shown in the git repository.
(b) Facenet requires squared L2; we use L2 instead, as the two are equivalent for comparison purposes.
Self-trained ResNet model.
To evaluate models trained by small enterprises, we trained a model with little effort, which is what developers at small enterprises with ML demands would do. We trained a model with a popular network structure on an open face dataset. The model we chose is Inception-ResNet-v1. We trained it on the CASIA-WebFace dataset and tested it on the LFW dataset. We added a cross-entropy loss over the “Additive Margin Softmax” after a dense layer on the embedding, which turns the model into a classifier for training. Because the model was trained with a dense layer and a classifier, its embeddings can be compared by cosine distance. The model achieved only moderate accuracy, as no sophisticated deep learning tricks were added, which imitates small-enterprise models.
Online Model. Mainly to test whether our recovery model can recover images under a pure black box adversary assumption, we added a completely black box face embedding model, i.e., an online model, to our target list. We surveyed popular online face embedding models and found Clarifai to be the most popular one (it ranked first in a Google search for “Face Embedding API”). Clarifai provides SDKs for almost all platforms that let developers access its servers for embedding generation via simple, easy-to-use APIs, and the model is never exposed to developers in the process.
Facenet. Facenet (Schroff et al., 2015) is the first well-known work on face embedding, as its triplet loss greatly improved face embedding performance. Facenet is also the most popular open source face embedding work (it ranked first in a Google search for “Face Embedding”). The most popular implementation of Facenet is (Sandberg, 2019), which appeared next to the Facenet paper in a Google search for “Facenet”. According to the history of its git page, the author published two 128-dimension face embedding models and recently updated to a 512-dimension version. We consider the current 512-dimension model and one of the previous 128-dimension models.
5.2. Evaluation Metric
To confirm whether the generated images indeed harm embedding owners’ privacy, we measure the similarity between the original images and the generated images and use it to quantify the extent of privacy leakage. Specifically, we define the recovery quality loss as the distance between the embedding (computed with the Facenet 512D model) of an original image and that of its generated counterpart. If the distance is below the commonly used threshold for regarding two embeddings as coming from the same person (0.63 in our evaluation), we regard the recovery quality loss as negligible.
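The metric defined above reduces to a simple computation once the distances between original and recovered embeddings are available. The sketch below assumes those distances have already been computed with the evaluation model; the function name `recovery_metrics` and the toy distance values are hypothetical, and reporting the mean distance alongside the success rate is one plausible summary rather than a description of the exact procedure used.

```python
THRESHOLD = 0.63  # same-person threshold quoted in the text for the evaluation model

def recovery_metrics(distances, threshold=THRESHOLD):
    """Return (success rate, mean recovery quality loss) for a list of distances."""
    successes = sum(1 for d in distances if d < threshold)
    return successes / len(distances), sum(distances) / len(distances)

# Hypothetical distances between original and recovered embeddings.
toy_distances = [0.31, 0.48, 0.70, 0.55]
rate, mean_loss = recovery_metrics(toy_distances)
print(f"success rate: {rate:.2%}, mean quality loss: {mean_loss:.4f}")
```

A recovered image counts as a success when an off-the-shelf verifier would accept it as the same person as the original.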
In addition, we recruited five volunteers to subjectively rate the similarity between the original face images and the recovered images, in case the model generates images that merely deceive face embedding models.
5.3. Experiment Setup
We set up a platform (Table 2) to run all the following evaluation experiments.
We use most of the images in LFW (around 12000) to train the recovery model, while the remaining 384 images are used for testing success rate and quality loss. Training a recovery model takes around 10 to 13 hours.
Note that up to three models are involved in a single training run: the target model, i.e., the model generating embeddings; the model for generation; and the model for the accuracy test, which is always Facenet 512D in our case.
|58.00% (a)|0.5987|3.8|4.0|
(a) Clarifai failed to return embeddings for some LFW images, so only 250 images were left for testing.
5.4. Pure Black Box Recovery
We evaluated all four models under the black box adversary assumption. For the three models whose details we do have, we set the embedding loss to the constant 0, as if attackers did not know the model, when training our recovery framework. We then train a recovery model for the embeddings generated by each of the four models.
Evaluation Result. As Table 3 shows, the recovery model trained for the Facenet 128-dimension model achieved the best overall recovery performance, while the one for the Clarifai model achieved the worst. The recovery quality for the two Facenet models is nearly perfect, while that for the other two is less impressive.
Table 4 shows some recovered samples for each model, which lead to a similar conclusion. For the cases with the best recovery quality, i.e., Facenet 512D and 128D, the recovered images clearly show faces that can be confidently regarded as another photo of the victim. Even in the worst case, i.e., Clarifai, there are still considerable similarities between the recovered and original images, indicating a serious privacy threat to embedding owners.
The subjective ratings agree with the model-based metric. Our volunteers agreed that the images recovered for the two Facenet models have high fidelity and that most of them (over 90%) could be taken for real photos of their owners.
Distance Distribution. The recovered image can be regarded as another image of the embedding owner. Figure 5 shows the distributions of the distances between two images of the same person, between two images of different people, and between an original image and its recovered image. As we can see, the threshold line clearly separates the same-person distances (left, smaller) from the different-people distances (right, larger). The distance between an original image and its recovered image has a distribution nearly identical to that of the same-person pairs, indicating that a recovered image can be regarded as coming from the embedding owner.
Recovery Performance vs. Embedding Length. The recovery quality has nearly no correlation with embedding length. Theoretically, an embedding of higher dimension can deliver richer information. However, the information delivered by embeddings includes a large amount of redundancy, meaning that the actual information entropy an embedding contains is far less than the capacity its container (the embedding length) provides.
The results coincide with Facenet’s results: a longer embedding does not imply better recognition results. Their evaluation showed that a 128-dimension embedding already achieves the best performance, while a longer 256- or 512-dimension embedding does not yield better recognition accuracy.
Even if embedding length were somehow correlated with recovery quality, it would not limit our recovery quality, as our recovery model already achieves excellent performance on the 128D Facenet model, and 128D is already the shortest among well-known embedding schemes.
Recovery Quality vs Accuracy. The recovery performance neither has strong correlation with embedding model accuracy. It is believed that better recognition accuracy is a result of richer information delivered by embeddings. Similarly, the richer information can result in a better recovery quality. The results showed that this is not always true. As you may see from our evaluation result, the recovery for online Clarifai model has pretty poor quality while this model has a very high recognition accuracy. Except the Clarifai case, the rest three models approximately follow this rule. Facenet models have both better recognition accuracy and recovery quality than the self trained resnet model.
We still believe the recovery quality strongly depends on target embedding model’s accuracy. We analyzed Clarifai online model and found that, though the model yield a very high recognition accuracy, the distance between two embeddings from different people is still pretty low. Specifically, distance from different people and distance from the same person have close absolute value, though they have clear boundary so they can be told apart from each other. We believe this is the reason of relatively lower quality for the high accuracy model. Besides, we have no idea if Clarifai model was trained with LFW. We are sure that the other three models have never seen our data to train the recovery model, i.e. LFW. However we have no any knowledge about Clarifai model’s training process. So, it is possible that LFW was also used as part of their training data, which later interfered our training process, as training data are always over-fitting points of the model.
From another perspective, we believe the difficulty of recovering embeddings differs across models. Embedding models differ in their capacity to describe mappings; some have higher capacity and some lower. Usually, embedding network structures with higher capacity achieve higher recognition accuracy. Our recovery model, however, has only limited capacity to describe a mapping. If the target model is so strong (higher accuracy) that the recovery model lacks the capacity to describe its inverse mapping, the recovery quality will be low.
5.5. Recovering Training Data
Usually people do not care about prediction quality on training data. However, we still evaluated the recovery quality on the training data of the original model, because ML developers may use user data collected later to fine-tune their model. As a result, user data that were originally test data become training data.
We used 1472 images from the CASIA-WebFace database (the training data of the Facenet 512 model) and tested their recovery quality with the recovery model for the Facenet 512 model obtained in Section 5.4. Surprisingly, the success rate of 88.18% is even lower than that on unseen test data (92.19%).
We conjecture this is because the target model over-fits its own training data. Specifically, the mapping from training data to embeddings differs slightly from that of unseen data, which gives the target model better accuracy on its training data. However, our recovery model only learns the inverse mapping from data unseen by the target model (the recovery model's training data, not the target model's training data), so it describes the inverse mapping on the target model's training data poorly. As a result, our recovery model is less suitable for recovering the target model's training data.
5.6. Substitutional Model Assisted Black Box
We simulated a black box adversary who uses an open source model to assist her recovery model training.
We take the latest Facenet implementation, i.e., Facenet 512D, as our target model. When training the recovery model, we use the older Facenet 128D model to construct the embedding loss, as if an attacker who cannot obtain the target model uses an open source model instead.
[Table 5: column headers Q. Loss and Q. Inc.]
As we can see from Table 5, both the success rate and the quality increased with the assistance of an older version of the model.
We believe the most important factor for the improvement is model diversity. With a different model supervising the image similarity, the generator is more likely to generate images with fewer defects, because every embedding model may neglect some important feature of the input image that another model focuses on and handles. It is possible that the generator performs even better if more models are added to supervise its training (by adding multiple embedding loss terms from different models). However, we also noticed that GPU memory may not be able to accommodate more models. Because of the GPU memory limitation, we could not even fit a 128D Facenet model to assist our generator for the 1024D and 1792D models. We believe GPUs with more memory may help attackers achieve training assisted by multiple models.
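The idea of supervising the generator with several embedding models can be sketched as follows. This is only an illustration under stated assumptions: the embedding loss is taken to be the L2 distance between the embeddings of the original and the recovered image, the per-model terms are combined as a weighted sum, and the two toy "embedders" simply look at different halves of a four-value "image". None of these names or choices are taken from our training code.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def multi_model_embedding_loss(original, recovered, embedders, weights=None):
    """Weighted sum of per-model embedding distances (hypothetical formulation)."""
    if weights is None:
        weights = [1.0] * len(embedders)
    return sum(
        w * l2_distance(embed(original), embed(recovered))
        for w, embed in zip(weights, embedders)
    )


def embed_a(image):
    """Toy embedder that only looks at the first half of the image."""
    return image[:2]


def embed_b(image):
    """Toy embedder that only looks at the second half of the image."""
    return image[2:]


original = [1.0, 2.0, 3.0, 4.0]
recovered = [1.0, 2.0, 3.0, 8.0]  # defect in a region embed_a neglects

loss_single = multi_model_embedding_loss(original, recovered, [embed_a])
loss_multi = multi_model_embedding_loss(original, recovered, [embed_a, embed_b])
```

With `embed_a` alone the defect goes unpenalized, while adding `embed_b` exposes it. Every additional embedder must also be held in memory during training, which is where the GPU memory limitation arises.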
5.7. White Box Recovery
We also evaluated recovery quality under the white box assumption. We trained the recovery model for Facenet 512D with the embedding loss also constructed by Facenet 512D.
Unfortunately, we ended up with even worse results than in the black box setting, as shown in Table 5. This runs contrary to common sense: usually we expect a model to work better in the white box setting because more information is available.
The only explanation we can find is that the recovery model is prone to over-fitting if it is also supervised by the target model. This explanation is consistent with the performance improvement in Section 5.6. The generator already exploits the mapping from embedding to image, while the opinion of another model, used as the embedding loss, corrects some points neglected by the target model. However, the neglected parts are reinforced if the same model again interferes with learning the mapping.
Given this observation, a white box attacker would do better to employ another model with an entirely different structure to assist recovery model training.
6. Related Works
Developers without enough ML knowledge can still make good use of ML, but they may cause severe privacy leakage for users. As our investigation shows, in the face embedding case, attackers can recover users' appearance with very high fidelity when only embeddings are leaked. What is worse, developers are not aware of the risks behind embeddings and so may inadvertently leak embeddings directly or indirectly. We call on the community to pay attention to this unobserved leakage and to push developers to better understand ML in order to avoid such leakage in the future.
- Hinton et al. (2015) Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).
- Sandberg (2019) David Sandberg. 2019. The most popular facenet implementation and pre-trained model. https://github.com/davidsandberg/facenet. (2019). Accessed: 2019-01-20.
- Schroff et al. (2015) Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 815–823.