runs several layers of a deep learning model in TrustZone
We present DarkneTZ, a framework that uses an edge device's Trusted Execution Environment (TEE) in conjunction with model partitioning to limit the attack surface against Deep Neural Networks (DNNs). Increasingly, edge devices (smartphones and consumer IoT devices) are equipped with pre-trained DNNs for a variety of applications. This trend comes with privacy risks as models can leak information about their training data through effective membership inference attacks (MIAs). We evaluate the performance of DarkneTZ, including CPU execution time, memory usage, and accurate power consumption, using two small and six large image classification models. Due to the limited memory of the edge device's TEE, we partition model layers into more sensitive layers (to be executed inside the device TEE), and a set of layers to be executed in the untrusted part of the operating system. Our results show that even if a single layer is hidden, we can provide reliable model privacy and defend against state of the art MIAs, with only 3% performance overhead. When fully utilizing the TEE, DarkneTZ provides model protections with up to 10% overhead.
Advances in memory and processing resources and the urge to reduce data transmission latency have led to a rapid rise in the deployment of various Deep Neural Networks (DNNs) on constrained edge devices (e.g., wearables, smartphones, and consumer Internet of Things (IoT) devices). Compared with centralized infrastructures (i.e., Cloud-based systems), hybrid and edge-based learning techniques enable methods for preserving users’ privacy, as raw data can stay local (Osia et al., 2018). Nonetheless, recent work demonstrated that local models still leak private information (Yu et al., 2019; Hitaj et al., 2017; Yeom et al., 2018; Melis et al., 2019; Zhang et al., 2017; Zeiler and Fergus, 2014; Li et al., 2013; Salem et al., 2018). This can be used by adversaries aiming to compromise the confidentiality of the model itself or that of the participants in training the model (Shokri et al., 2017; Yeom et al., 2018). The latter is part of a more general class of attacks, known as Membership Inference Attacks (referred to as MIAs henceforth).
MIAs can have severe privacy consequences (Li et al., 2013; Salem et al., 2018) motivating a number of research works to focus on tackling them (Abadi et al., 2016; Mironov, 2017; Jayaraman and Evans, 2019). Predominantly, such mitigation approaches rely on differential privacy (Dwork et al., 2014; Erlingsson et al., 2014), whose improvement in privacy preservation comes with an adverse effect on the model’s prediction accuracy.
We observe that edge devices are now increasingly equipped with a set of software and hardware security mechanisms powered by processor (CPU) designs offering strong isolation guarantees. System designs such as Arm TrustZone can enforce memory isolation between an untrusted part of the system operating in a Rich Execution Environment (REE), and a smaller trusted component operating in a hardware-isolated Trusted Execution Environment (TEE), responsible for security critical operations. If we could efficiently execute sensitive DNNs inside the trusted execution environments of mobile devices, this would allow us to limit the attack surface of models without impairing their classification performance. Previous work has demonstrated promising results in this space; recent advancements allow for high-performance execution of sensitive operations within a TEE (Hunt et al., 2018; Hanzlik et al., 2018; Tople et al., 2018; Gu et al., 2018; Tramèr and Boneh, 2019). These works have almost exclusively experimented with integrating DNNs in cloud-like devices equipped with Intel Software Guard eXtensions (SGX). However, this paradigm does not translate well to edge computing due to significant differences in the following three factors: attack surface, protection goals, and computational performance. The attack surface on servers is exploited to steal a user’s private data, while the adversary on a user’s edge device focuses on compromising a model’s privacy. Consequently, the protection goal in most works combining deep learning with TEEs on the server (e.g., Gu et al., 2018 and Hunt et al., 2018) is to preserve the privacy of a user’s data during inference, while the protection on edge devices preserves both the model privacy and the privacy of the data used in training this model. Lastly, edge devices (such as IoT sensors and actuators) have limited computational resources compared to cloud computing devices; hence we cannot merely use performance results derived on an SGX-enabled system on the server to extrapolate measurements for TEE-enabled embedded systems. In particular, blindly integrating a DNN in an edge device’s TEE might not be computationally practical or even possible. We need a systematic measurement of the effects of such designs on edge-like environments.
Since DNNs follow a layered architecture, this can be exploited to partition a DNN, having a sequence of layers executed in the untrusted part of the system while hiding the execution of sensitive layers in the trusted, secure environment. We utilize the TEE (i.e., Arm TrustZone) and perform a unique layer-wise analysis to illustrate the privacy repercussions of an adversary on relevant neural network models on edge devices with the corresponding performance effects. To the best of our knowledge, we are the first to embark on examining to what extent this is feasible on resource-constrained mobile devices. Specifically, we lay out the following research question:
RQ1: Is it practical to store and execute a sequence of a DNN’s sensitive layers inside the TEE of an edge device?
To answer this question we design a framework, namely DarkneTZ, which enables an exhaustive layer by layer resource consumption analysis during the execution of a DNN model. DarkneTZ partitions a model into a set of non-sensitive layers run within the system’s REE and a set of sensitive layers executed within the trusted TEE. We use DarkneTZ to measure, for a given DNN (we evaluate two small and six large image classification models), the underlying system’s CPU execution time, memory usage, and accurate power consumption for different layer partition choices. We demonstrate our prototype of DarkneTZ using the Open Portable TEE (OP-TEE, https://www.op-tee.org/) software stack running on a Hikey 960 board (https://www.96boards.org/product/hikey960/). OP-TEE is compatible with the mobile-popular Arm TrustZone-enabled hardware, while our choice of hardware closely resembles common edge devices’ capabilities (Ying et al., 2018; Park et al., 2019). Our results show that DarkneTZ only has 10% overhead when fully utilizing all available secure memory of the TEE for protecting a model’s layers.
These results illustrate that REE-TEE partitions of certain DNNs can be efficiently executed on resource constrained devices. Given this, we next ask the following question:
RQ2: Are such partitions useful to both effectively and efficiently tackle realistic attacks against DNNs on mobile devices?
To answer this question, we develop a threat model considering state of the art MIAs against DNNs. We implement the respective attacks and use DarkneTZ to measure their effectiveness (adversary’s success rate) for different model partition choices. We show that by hiding a single layer (the output layer) in the TEE of a resource-constrained edge device, the adversary’s success rate degrades to random guess while (a) the resource consumption overhead on the device is negligible (3%) and (b) the accuracy of inference remains intact. We also demonstrate the overhead of fully utilizing TrustZone for protecting models, and show that DarkneTZ can be an effective first step towards achieving hardware-based model privacy on edge devices.
Paper Organisation. The rest of the paper is organized as follows: Section 2 discusses background and related work and Section 3 presents the design and main components of DarkneTZ. Section 4 provides implementation details and describes our evaluation setup (our implementation is available online at https://github.com/mofanv/darknetz), while Section 5 presents our system performance and privacy evaluation results. Lastly, Section 6 discusses further performance and privacy implications that can be drawn from our systematic evaluation and we conclude in Section 7.
Model privacy risks. With successful training (i.e., the model converging to an optimal solution), a DNN model “memorizes” features of the input training data (Yeom et al., 2018; Radhakrishnan et al., 2018) (see (Zhang et al., 2019; LeCun et al., 2015) for more details on deep learning), which it can then use to recognize unseen data exhibiting similar patterns. However, models have the tendency to include more specific information of the training dataset unrelated to the target patterns (i.e., the classes that the model aims to classify) (Yeom et al., 2018; Caruana et al., 2001).
Moreover, each layer of the model memorizes different information about the input. Yosinski et al. (Yosinski et al., 2014) found that the first layers (closer to the input) are more transferable to new datasets than the last layers. That is, the first layers learn more general information (e.g., ambient colors in images), while the last layers learn information that is more specific to the classification task (e.g., face identity). The memorization difference per layer has been verified both in convolutional layers (Zeiler and Fergus, 2014; Yosinski et al., 2015) and in generative models (Zhao et al., 2017). Evidently, an untrusted party with access to the model can leverage the memorized information to infer potentially sensitive properties about the input data, which leads to severe privacy risks.
Membership inference attack (MIA). MIAs are a possible attack on devices that leverages memorized information in a model’s layers to determine whether a given data record was part of the model’s training dataset (Shokri et al., 2017). In a black-box MIA, the attacker leverages the model’s outputs (e.g., confidence scores) and auxiliary information (e.g., public datasets or public prediction accuracy of the model) to train shadow models or classifiers without accessing internal information of the model (Shokri et al., 2017; Yeom et al., 2018). However, in a white-box MIA, the attacker utilizes the internal knowledge (i.e., gradients and activation of layers) of the model in addition to the model’s outputs to increase the effectiveness of the attack (Nasr et al., 2019). It is shown that the last layer (model output) has the highest membership information about the training data (Nasr et al., 2019). We consider a white-box adversary as our threat model, as DNNs are fully accessible after being transferred from the server to edge devices (Xu et al., 2019b). In addition to this, a white-box MIA is a stronger adversary than a black-box MIA, as the information the adversary has access to in a black-box attack is a subset of that used in a white-box attack.
Trusted execution environment (TEE). A TEE is a trusted component which runs in parallel with the untrusted Rich operating system Execution Environment (REE) and is designed to provide safeguards for ensuring the confidentiality and integrity of its data and programs. This is achieved by establishing an isolated region on the main processor, and both hardware and software approaches are utilized to isolate this region. The chip includes additional elements such as unchangeable private keys or secure bits during manufacturing, which helps ensure that untrusted parts of the platform (even privileged OS or hypervisor processes) cannot access TEE content (Costan and Devadas, 2016; Arm, 2009).
In addition to strong security guarantees, TEEs also provide better computational performance than existing software protections, making them suitable for computationally-expensive deep learning tasks. For example, advanced techniques such as fully homomorphic encryption enable operators to process the encrypted data and models without decryption during deep learning, but significantly increase the computation cost (Naehrig et al., 2011; Acar et al., 2018). Conversely, TEE protection only requires additional operations to build the trusted environment and the communication between trusted and untrusted parts, so its performance is comparable to normal executions in an untrusted environment (e.g., an OS).
Deep learning with TEEs. Previous work leveraged TEEs to protect deep learning models. Apart from the unique attack surface and thus protection goals we consider, these also differ from our approach in one more aspect: they depend on an underlying computer architecture which is more suitable for cloud environments. Recent work has suggested executing a complete deep learning model in a TEE (Costan and Devadas, 2016), where during training, users’ private data is transferred to the trusted environment using trusted paths. This prevents the host Cloud from eavesdropping on the data (Ohrimenko et al., 2016). Several other studies improved the efficiency of TEE-resident models using Graphics Processing Units (GPU) (Tramèr and Boneh, 2019), multiple memory blocks (Hunt et al., 2018), and high-performance ML frameworks (Hynes et al., 2018). More similar to our approach, Gu et al. (Gu et al., 2018) partitioned DNN models and only enclosed the first layers in an SGX-powered TEE to mitigate input information disclosures of real-time fed device user images. In contrast, the membership inference attacks we consider become more effective by accessing information in the last layers. All these works use an underlying architecture based on Intel’s SGX, which is not suitable for edge devices. Edge devices usually have chips designed using Reduced Instruction Set Computing (RISC), peripheral interfaces, and much lower computational resources (around 16 mebibytes (MiB) memory for TEE) (Ekberg et al., 2014). Arm’s TrustZone is the most widely used TEE implementation in edge devices. It involves a more comprehensive trusted environment, including the security extensions for the AXI system bus, processors, interrupt controller, TrustZone address space controller, etc. Camera or voice input connected to the APB peripheral bus can be controlled as a part of the trusted environment by the AXI-to-APB bridge. Utilizing TrustZone for on-device deep learning requires more developments and investigations because of its different features compared to SGX.
An effective method for reducing the memorization of private information of training data in a DNN model is to avoid overfitting via imposing constraints on the parameters and utilizing dropouts (Shokri et al., 2017). Differential Privacy (DP) can also obfuscate the parameters (e.g., adding Gaussian noise to the gradients) during training to control each input’s impact on them (Abadi et al., 2016; Yu et al., 2019). However, DP may negatively affect the utility (i.e., the prediction accuracy) if the noise is not carefully designed (Rahman et al., 2018). In order to obfuscate private information only, one could apply methods such as generative neural networks (Xu et al., 2019a) or adversarial examples (Jia et al., 2019) to craft noises for one particular data record (e.g., one image), but this requires additional computational resources which are already limited on edge devices.
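As a rough illustration of the gradient obfuscation mentioned above, the following minimal sketch clips a gradient vector and adds Gaussian noise to it. It is not the mechanism of any cited work: the function name `clip_and_noise` and the parameters `clip_norm` and `noise_multiplier` are assumptions made for this example, and a real differentially private training loop would also clip per example and track a privacy budget.

```python
import math
import random


def clip_and_noise(gradient, clip_norm, noise_multiplier, rng):
    """Clip a gradient to an L2 norm of clip_norm, then add Gaussian noise.

    All names here are illustrative assumptions, not an API from the paper.
    """
    norm = math.sqrt(sum(g * g for g in gradient))
    # Scale the vector down only if its norm exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in gradient]
    # Noise standard deviation grows with the clipping bound.
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


if __name__ == "__main__":
    rng = random.Random(0)
    print(clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=0.5, rng=rng))
```

A larger `noise_multiplier` bounds each input's influence more tightly but perturbs the update more, which is the utility cost discussed above.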
Server-Client model partition. General information processed in the first layers (Yosinski et al., 2014) during forward propagation of deep learning often includes more important indicators for the inputs than those in the last layers (which is opposite to membership indicators), since reconstructing the updated gradients or activation of the first layers can directly reveal private information of the input (Aono et al., 2018; Dosovitskiy and Brox, 2016). Based on this, hybrid training models have been proposed which run several first layers at the client-side for feature extraction and then upload these features to the server-side for classification (Osia et al., 2020). Such partition approaches delegate parts of the computation from the servers to the clients, and thus, in these scenarios, striking a balance between privacy and performance is of paramount importance.
Gu et al. (Gu et al., 2018) follow a similar layer-wise method and leverage TEEs on the cloud to isolate the more private layers. Clients’ private data are encrypted and then fed into the cloud TEE so that the data and first several layers are protected. This method expands the clients’ trusted boundary to include the server’s TEE and utilizes an REE-TEE model partition at the server which does not significantly increase clients’ computation cost compared to running the first layers on client devices. To further increase training speed, it is also possible to transfer all linear layers outside a cloud’s TEE into an untrusted GPU (Tramèr and Boneh, 2019). All these partitioning approaches aim to prevent leakage of private information of users (to the server or others), yet do not prevent leakage from trained models when models are executed on the users’ edge devices.
We now describe DarkneTZ, a framework for preserving DNN models’ privacy on edge devices. We start with the threat model which we focus on in this paper.
We consider an adversary with full access to the REE of an edge device (e.g., the OS): this could be the actual user, malicious third-party software installed on the device, or a malicious or compromised OS. We only trust the TEE of an edge device to guarantee the integrity and confidentiality of the data and software in it. In particular, we assume that a DNN model is pre-trained using private data from the server or other participating nodes. We assume the model providers can fully guarantee the model privacy during training on their servers by utilizing existing protection methods (Ohrimenko et al., 2016) or even by training the model offline, so the model can be secretly provisioned to the user devices without other privacy issues.
The design of DarkneTZ aims at mitigating attacks on on-device models by protecting layers and the output of the model at low cost by utilizing an on-device TEE. It should be compatible with edge devices. That is, it should integrate with TEEs which can run on hardware technologies that can be found on commodity edge devices (e.g., Arm TrustZone), and use standard TEE system architectures and corresponding APIs.
We propose DarkneTZ, illustrated in Figure 1, a framework that enables DNN layers to be partitioned into two parts to be deployed respectively into the REE and TEE of edge devices. DarkneTZ allows users to do inference with or fine-tuning of a model seamlessly (the partition is transparent to the user) while at the same time considering the privacy concerns of the model’s owner. A corresponding Client Application (CA) and Trusted Application (TA) perform the operations in the REE and TEE, respectively. Without loss of generality, DarkneTZ’s CA runs layers $1$ to $l$ in the REE, while its TA runs layers $l+1$ to $L$ located in the TEE during fine-tuning or inference of a DNN. This DNN partitioning can help the server to mitigate several attacks such as MIAs (Nasr et al., 2019; Mo et al., 2019), as the last layers have a higher probability of leaking private information about training data (see Section 2).
DarkneTZ expects sets of layers to be pre-provisioned in the TEE by the analyst (if the framework is used for offline measurements) or by the device OEM if a version of DarkneTZ is implemented on consumer devices. Note that in the latter case, secret provisioning of sensitive layers can also be performed over the air, which might be useful when the sensitive layer selection needs to be dynamically determined and provisioned to the edge device after supply. In this case, one could extend DarkneTZ to follow a variation of the SIGMA secure key exchange protocol (Krawczyk, 2003), modified to include remote attestation, similar to (Zhao et al., 2019). SIGMA is provably secure and efficient. It guarantees perfect forward secrecy for the session key (to defend against replay attacks) while its use of message authentication codes ensures server and client identity protection. Integrating remote attestation guarantees that the server provisions the model to a non-compromised edge device.
Once the model is provisioned, the CA requests the layers from the device’s storage (e.g., a solid-state drive (SSD)) and invokes the TA. The CA will first build the DNN architecture and load the parameters of the model into normal memory (i.e., non-secure memory) to process all calculations and manipulations of the non-sensitive layers in the REE. When it encounters (secretly provisioned) encrypted layers that need to be executed in the TEE, as determined by the model owner’s setting, the CA passes them to the TA. The TA decrypts these layers using a key that is securely stored in the TEE (using secure storage), and then it runs the more sensitive layers in the TEE’s secure memory. The secure memory is indicated by one additional address bit introduced to all memory system transactions (e.g., cache tags, memory, and peripherals) to block non-secure access (Arm, 2009). At this point, the model is ready for fine-tuning and inference.
The forward pass of both inference and fine-tuning passes the input $a_0$ to the DNN to produce activation of layers until the last layer, i.e., layer $l$’s activation is calculated by $a_l = f(w_l a_{l-1} + b_l)$, where $w_l$ and $b_l$ are weights and biases of this layer, $a_{l-1}$ is activation of its previous layer and $f$ is the non-linear activation function. Therefore, after the CA processes its inside layers from $1$ to $l$, it invokes a command to transfer the outputs (i.e., activation) of layer $l$ (i.e., the last layer in the CA) to the secure memory through a buffer (in shared memory). The TA switches to the forward_net_TA function corresponding to the invoked command to receive parameters (i.e., outputs/activation) of layer $l$ and processes the following forward pass of the network (from layer $l+1$ to layer $L$) in the TEE. In the end, outputs of the last layer are first normalized to control the membership information leakage and are then returned via shared memory as the prediction results.
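The partitioned forward pass can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layers are fully connected with ReLU, both halves run in the same process (on a device the second half would run inside the TA after the activation is copied through shared memory), and reducing the final output to a top-1 class index is a choice made for this sketch, not necessarily the normalization the framework applies. All function names are hypothetical.

```python
def relu(values):
    """Element-wise ReLU, standing in for the non-linear function f."""
    return [max(0.0, v) for v in values]


def layer_forward(weights, biases, a_prev):
    """Compute a_l = f(w_l a_{l-1} + b_l) for one fully connected layer."""
    pre_activation = [
        sum(w * a for w, a in zip(row, a_prev)) + b
        for row, b in zip(weights, biases)
    ]
    return relu(pre_activation)


def run_layers(layers, activation):
    """Apply a sequence of (weights, biases) layers in order."""
    for weights, biases in layers:
        activation = layer_forward(weights, biases, activation)
    return activation


def partitioned_forward(model, split, a0):
    """Run the first `split` layers in the 'REE' and the rest in the 'TEE'."""
    ree_layers, tee_layers = model[:split], model[split:]
    a_split = run_layers(ree_layers, a0)       # client application side
    a_last = run_layers(tee_layers, a_split)   # trusted application side
    # Assumption of this sketch: only the top-1 class index leaves the TEE.
    return max(range(len(a_last)), key=a_last.__getitem__)


if __name__ == "__main__":
    toy_model = [
        ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
        ([[1.0, 2.0], [3.0, -1.0]], [0.0, 0.5]),
    ]
    print(partitioned_forward(toy_model, split=1, a0=[1.0, 2.0]))
```

Because the split only changes where each layer runs, the prediction is the same for every value of `split`, which is why partitioning leaves inference accuracy unchanged.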
The backward pass of fine-tuning computes gradients of the loss function $\mathcal{L}(a_L, y)$ with respect to each weight $w_l$ and bias $b_l$, and updates the parameters of all layers as $w_l \leftarrow w_l - \eta \, \partial \mathcal{L}(a_L, y) / \partial w_l$ and $b_l \leftarrow b_l - \eta \, \partial \mathcal{L}(a_L, y) / \partial b_l$, where $\eta$ is a constant called the learning rate and $y$ is the desired output (i.e., the label). The TA can compute the gradient of the loss function by receiving $y$ from the CA and back-propagate it to the CA in order to update all the parameters. In the end, to save the fine-tuned model on devices, all layers in the TA are encrypted and transferred back to the CA.
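To make the update rule concrete, the sketch below computes the gradients for a single linear output layer and applies one gradient descent step. It assumes a squared-error loss and no activation function on the output, which are simplifications chosen for this example rather than the configuration used by the framework; the function names are hypothetical. The third return value is the gradient with respect to the incoming activation, i.e., the quantity that would be handed back across the TEE boundary to continue backpropagation in the CA.

```python
def output_layer_gradients(weights, biases, a_prev, label):
    """Gradients of 0.5 * sum((out - label)^2) for out = W a_prev + b."""
    out = [
        sum(w * a for w, a in zip(row, a_prev)) + b
        for row, b in zip(weights, biases)
    ]
    delta = [o - y for o, y in zip(out, label)]
    grad_w = [[d * a for a in a_prev] for d in delta]
    grad_b = list(delta)
    # Gradient passed back to the previous layer: W^T delta.
    grad_prev = [
        sum(weights[i][j] * delta[i] for i in range(len(delta)))
        for j in range(len(a_prev))
    ]
    return grad_w, grad_b, grad_prev


def sgd_step(weights, biases, grad_w, grad_b, learning_rate):
    """Apply w <- w - eta * dL/dw and b <- b - eta * dL/db."""
    new_w = [
        [w - learning_rate * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grad_w)
    ]
    new_b = [b - learning_rate * g for b, g in zip(biases, grad_b)]
    return new_w, new_b


if __name__ == "__main__":
    w, b = [[1.0, 0.0]], [0.0]
    gw, gb, gprev = output_layer_gradients(w, b, a_prev=[2.0, 3.0], label=[1.0])
    print(sgd_step(w, b, gw, gb, learning_rate=0.1), gprev)
```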
We first use two popular DNNs, namely AlexNet and VGG-7, to measure the system’s performance. AlexNet has five convolutional layers (i.e., with kernel size 11, 5, 3, 3, and 3) followed by a fully-connected and a softmax layer, and VGG-7 has eight layers (i.e., seven convolutional layers with kernel size 3, followed by a fully-connected layer). Both AlexNet and VGG-7 use ReLU (Rectified Linear Unit) activation functions for all convolutional layers. The number of neurons for AlexNet’s layers is 64, 192, 384, 256, and 256, while the number of neurons for VGG-7’s layers is 64, 64, 124, 124, 124, 124, and 124. We train the networks and conduct inference on CIFAR-100 and ImageNet Tiny. We use image classification datasets, as a recent empirical study shows that the majority of smartphone applications (70.6%) that use deep learning are for image processing (Xu et al., 2019b). Moreover, the state of the art MIA we are considering is demonstrated against such datasets (Nasr et al., 2019). CIFAR-100 includes 50k training and 10k test images of size $32 \times 32 \times 3$ belonging to 100 classes. ImageNet Tiny is a simplified ImageNet challenge that has 100k training and 10k test images of size $64 \times 64 \times 3$ belonging to 200 classes.
In addition to this, we use six available DNNs (Tiny Darknet (4 megabytes (MB)), Darknet Reference (28MB), Extraction (Szegedy et al., 2015) (90MB), Resnet-50 (He et al., 2016) (87MB), Densenet-201 (Huang et al., 2017) (66MB), and Darknet-53-448 (159MB)) pre-trained on the original ImageNet (Deng et al., 2009) dataset to measure DarkneTZ’s performance during inference. All pre-trained models can be found online (https://pjreddie.com/darknet/imagenet/). ImageNet has 1000 classes, and consequently, these DNN models’ last layers occupy larger memory that can exceed the TEE’s limits, compared to models with 100/200 classes. Therefore, for these six models, we only evaluate the condition that their last layer is in the TEE.
To evaluate the defence’s effectiveness against MIAs, we use the same models as those used in the demonstration of the attack (Nasr et al., 2019) (AlexNet, VGG-7, and ResNet-110). This ResNet with a depth of 110 is an existing network architecture that has three blocks (each with 36 convolutional layers) in the middle, one convolutional layer at the beginning, and one fully connected layer at the end (He et al., 2016). We use published models trained (with 164 epochs) on CIFAR-100 (Krizhevsky et al., ) that are available online (https://github.com/bearpaw/pytorch-classification). We also train three models on ImageNet Tiny (https://tiny-imagenet.herokuapp.com/) with 300 epochs as target models (i.e., victim models during attacks). Models with the highest validation accuracy are used after training. We follow the methodology of Nasr et al. (2019), and all training and test datasets are randomly split into two parts of equal size so that the MIA model learns both Member and Non-member images. For example, 25K training images and 5K test images from CIFAR-100 are chosen to train the MIA model, and then the model’s test precision and recall are evaluated using 5K training images and 5K test images from the remaining CIFAR-100 images.
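The split described above can be sketched as follows. This is a minimal illustration assuming images are identified by integer ids and that the evaluation set is balanced by taking as many held-out members as there are held-out non-members; the function names and the dictionary layout are assumptions of this example, not part of the original methodology.

```python
import random


def split_for_mia(train_ids, test_ids, rng):
    """Split members (training data) and non-members (test data) in half."""
    members, non_members = list(train_ids), list(test_ids)
    rng.shuffle(members)
    rng.shuffle(non_members)
    half_m, half_n = len(members) // 2, len(non_members) // 2
    attack_train = {
        "member": members[:half_m],
        "non_member": non_members[:half_n],
    }
    held_out_non_members = non_members[half_n:]
    # Balance the evaluation set: equal numbers of members and non-members.
    attack_eval = {
        "member": members[half_m:half_m + len(held_out_non_members)],
        "non_member": held_out_non_members,
    }
    return attack_train, attack_eval


def precision_recall(predicted, actual):
    """Precision and recall for binary labels (1 = Member, 0 = Non-member)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    train, evaluation = split_for_mia(range(50000), range(50000, 60000), random.Random(0))
    print({k: len(v) for k, v in train.items()}, {k: len(v) for k, v in evaluation.items()})
```

With CIFAR-100 sized inputs this yields 25K members and 5K non-members for training the attack model, and 5K of each for evaluating it.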
We develop an implementation based on the Darknet (Redmon, 2013) DNN library. We chose this particular library because of its high computational performance and small library dependencies, which fit within the limited secure memory of the TEE. We run the implementation on Open Portable TEE (OP-TEE), which provides the software (i.e., operating systems) for an REE and a TEE designed to run on top of Arm TrustZone-enabled hardware.
For TEE measurements, we focus on the performance of deep learning since secret provisioning only happens once for updating the model from servers. We implement 128-bit AES-GCM for on-device secure storage of sensitive layers. We test our implementation on a Hikey 960 board, a widely-used device (Ying et al., 2018; Akowuah et al., 2018; Dong et al., 2018; Brasser et al., 2019) that is expected to be comparable with mobile phones (and other existing products) due to its Android open source project support. The board has four ARM Cortex-A73 cores and four ARM Cortex-A53 cores (pre-configured to 2362MHz and 533MHz, respectively, by the device OEM), 4GB LPDDR4 SDRAM, and provides 16MiB secure memory for trusted execution, which includes 14MiB for the TA and 2MiB for TEE run-time. Another 2MiB shared memory is allocated from non-secure memory. As the Hikey board adjusts the CPU frequency automatically according to the CPU temperature, we decrease and fix the frequency of Cortex A73 to 903MHz and keep the frequency of Cortex A53 as 533MHz. During experiments we introduce a 120 seconds system sleep per trial to make sure that the CPU starts each trial cool enough to avoid underclocking.
Edge devices suffer from limited computational resources, and as such, it is paramount to measure the efficiency of deep learning models when partitioned to be executed partly by the OS and partly by the TEE. In particular we monitor and report CPU execution time (in seconds), memory usage (in megabytes), and power consumption (in watts) when the complete model runs in the REE (i.e., OS) and compare it with different partitioning configurations where more sensitive layers are kept within the TEE. CPU execution time is the amount of time that the CPU was used for deep learning operations (i.e., fine-tuning or inference). Memory usage is the amount of the mapping that is currently resident in the main memory (RAM) occupied by our process for deep learning related operations. Power consumption is the electrical energy consumption per unit time that was required by the Hikey board.
More specifically, we utilized the REE’s /proc/self/status for accessing the process information to measure the CPU execution time and memory usage of our implementation. CPU execution time is the amount of time for which the CPU was used for processing instructions of software (as opposed to wall-clock time which includes input/output operations) and is further split into (a) time in user mode and (b) time in kernel mode. The REE kernel time captures together (1) the time spent by the REE’s kernel and (2) the time spent by the TEE (including both while in user mode and kernel mode). This kernel time gives us a direct perception of the overhead when including TEEs for deep learning versus using the same REE without a TEE’s involvement.
Memory usage is represented using resident set size (RSS) memory in the REE, but the memory occupied in the TEE is not counted by the RSS since the REE does not have access to gather memory usage information of the TEE. The TEE is designed to conceal this sensitive information (e.g., both CPU time and memory usage); otherwise, the confidentiality of TEE contents would be easily breached by utilizing side-channel attacks (Wang et al., 2017). To overcome this, we trigger an abort from the TEE after the process runs stably (memory usage tends to be fixed) to obtain the memory usage of the TEE.
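The REE-side measurements can be approximated with the sketch below. It parses the `VmRSS` field from the text of `/proc/self/status` (a Linux-specific file) and, as a substitution for reading CPU time from procfs, uses Python's `os.times()` to obtain user and kernel time. The helper names are hypothetical, and none of this observes memory used inside the TEE, for the reasons given above.

```python
import os


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value (in KiB) from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def cpu_times():
    """Return (user_seconds, kernel_seconds) consumed by this process."""
    t = os.times()
    return t.user, t.system


if __name__ == "__main__":
    user, kernel = cpu_times()
    print("user time (s):", user, "kernel time (s):", kernel)
    status_path = "/proc/self/status"  # Linux only
    if os.path.exists(status_path):
        with open(status_path) as handle:
            print("resident set size (KiB):", parse_vm_rss_kib(handle.read()))
```

Sampling these values before and after a fine-tuning or inference run, and taking the difference in CPU time, gives the per-run figures of the kind reported in the evaluation.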
To accurately measure the power consumption, we used a Monsoon High Voltage Power Monitor (https://www.msoon.com/), a high-precision power metering hardware device capable of measuring the current consumed by a test device with a voltage range of 0.8V to 13.5V and up to 6A continuous current. We configured it to power the Hikey board using the required 12V voltage while recording the consumed current at a fixed sampling rate.
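Since the monitor supplies a constant voltage and records current, power follows from $P = V \cdot I$. The sketch below turns a list of current samples into average power and total energy; the 12 V default comes from the setup described above, while the sample values, the sampling rate used in the example, and the function names are assumptions for illustration.

```python
def average_power_watts(current_samples_amps, supply_voltage=12.0):
    """Average power P = V * mean(I) over the recorded samples."""
    mean_current = sum(current_samples_amps) / len(current_samples_amps)
    return supply_voltage * mean_current


def energy_joules(current_samples_amps, sampling_rate_hz, supply_voltage=12.0):
    """Energy as the sum of V * I * dt, with dt = 1 / sampling rate."""
    dt = 1.0 / sampling_rate_hz
    return sum(supply_voltage * i * dt for i in current_samples_amps)


if __name__ == "__main__":
    samples = [0.50, 0.52, 0.48, 0.50]  # hypothetical current readings in amps
    print(average_power_watts(samples))
    print(energy_joules(samples, sampling_rate_hz=10.0))
```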
We define the adversarial strategy in our setting based on state-of-the-art white-box MIAs which observe the behavior of all components of the DNN model (Nasr et al., 2019). White-box MIAs can achieve higher accuracy of distinguishing whether one input sample is presented in the private training dataset compared to black-box MIAs since the latter only have access to the models’ output (Yeom et al., 2018; Shokri et al., 2017). Besides, white-box MIAs are also highly possible in on-device deep learning, where a model user can not only observe the output, but also observe fine-grained information such as the values of the cost function, gradients, and activation of layers.
We evaluate the membership information exposure of a set of the target model’s layers by employing the white-box MIA (Nasr et al., 2019) on these layers. The attacker feeds the target data to the model and leverages all possible information in the white-box setting including activation of all layers, model’s output, loss function, and the gradients of the loss function with respect to the parameter of each layer. It then separately analyses each information source by extracting features from the activation of each layer, the model’s output and the loss function via fully connected neural networks with one hidden layer, while using convolutional neural networks for the gradients. All extracted features are combined in a global feature vector that is later used as an input for an inference attack model. The attack model predicts a single value (i.e., Member or Non-member) that represents the membership information of the target data (we refer the interested readers to (Nasr et al., 2019) for a detailed description of this MIA). We use the test accuracy of the MIA model trained on a set of layers to represent the advantage of adversaries as well as the sensitivity of these layers.
To measure the privacy risk when part of the model is in TEE, we conduct this MIA on our target model in two different settings: (i) starting from the first layer, we add the later layers one by one until the end of the network, and (ii) starting from the last layer we add the previous layers one by one until the beginning of the network. However, the available information of one specific layer during the fine-tuning phase and that during the inference phase are different when starting from the first layers. Inference only has a forward propagation phase which computes the activation of each layer. During fine-tuning and because of the backward propagation, in addition to the activation, gradients of layers are also visible. In contrast to that, attacks starting from the last layers can observe the same information in both inference and fine-tuning since layers’ gradients can be calculated based on the cost function. Therefore, in setting (i), we utilize activation, gradients, and outputs. In setting (ii), we only use the activation of each layer to evaluate inference and use both activation and gradients to evaluate fine-tuning, since the outputs of the model (e.g., confidence scores) are not accessible in this setup.
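The two settings can be enumerated mechanically. The short Python sketch below lists, for a model with a given number of layers, the layer sets evaluated in each setting; the function name and the 1-based layer indices are our own illustrative choices.

```python
def attacked_layer_sets(num_layers):
    """Return the layer sets for setting (i) and setting (ii).

    Setting (i): start from the first layer and add later layers one by one.
    Setting (ii): start from the last layer and add previous layers one by one.
    Layers are numbered from 1 (first) to num_layers (last).
    """
    from_first = [list(range(1, k + 1)) for k in range(1, num_layers + 1)]
    from_last = [list(range(num_layers - k + 1, num_layers + 1))
                 for k in range(1, num_layers + 1)]
    return from_first, from_last


from_first, from_last = attacked_layer_sets(4)
print(from_first)  # [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]]
print(from_last)   # [[4], [3, 4], [2, 3, 4], [1, 2, 3, 4]]
```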
In this Section we first evaluate the efficiency of DarkneTZ when protecting a set of layers in the TrustZone to answer RQ1. To evaluate system efficiency, we measure CPU execution time, memory usage, and power consumption of our implementation for both training and inference on AlexNet and VGG-7 trained on two datasets. We protect the last layers (starting from the output) since they are more vulnerable to attacks (e.g., MIAs) on models. The cost layer (i.e., the cost function) and the softmax layer are considered as a separate layer since they contain highly sensitive information (i.e., confidence scores and cost function). Starting from the last layer, we include the maximum number of layers that the TrustZone can hold. To answer RQ2, we use the MIA success rate, indicating the membership probability of target data (the more DarkneTZ limits this, the stronger the privacy guarantees are). We demonstrate the effect on performance and discuss the trade-off between performance and privacy using MIAs as one example.
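The rule of including the maximum number of layers that the TrustZone can hold, starting from the last layer, can be sketched as a greedy count. This is a simplified illustration: the per-layer byte sizes and the budget are hypothetical, and a real deployment would also need to account for buffers other than the layers themselves.

```python
def layers_to_protect(layer_bytes, secure_budget_bytes):
    """Count how many trailing layers fit into the secure-memory budget.

    layer_bytes lists the memory footprint of each layer from first to last.
    Layers are added from the output backwards until the next one would not fit.
    """
    used = 0
    count = 0
    for size in reversed(layer_bytes):
        if used + size > secure_budget_bytes:
            break
        used += size
        count += 1
    return count


# Hypothetical footprints (first layer to last layer) and budget, in bytes.
n_protected = layers_to_protect([400, 300, 200, 100, 50], secure_budget_bytes=360)
print(n_protected)  # 3: the last three layers use 350 bytes, the next needs 300 more
```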
As shown in Figure 2, including more layers in the TrustZone increases the CPU time of deep learning operations, and the most expensive configuration is the one that places the maximum number of layers in the TrustZone. Figure 1(a) shows the CPU time when training AlexNet and VGG-7 with TrustZone on the CIFAR-100 and ImageNet Tiny datasets, respectively. This increasing trend is significant and consistent for both datasets. We also observe that protecting only the last layer in the TrustZone has a negligible effect on CPU utilization, while including more layers to fully utilize the TrustZone during training can increase CPU time (by 10%). For inference, the increasing trend is also significant (see Figure 1(b)). Protecting only the last layer increases CPU time by around 3%, and the increase grows when the maximum possible number of layers is included in the TrustZone.
To further investigate the increase in CPU execution time, we analyzed all types of layers (both trainable and non-trainable) separately in the TrustZone. Trainable layers have parameters (e.g., weights and biases) that are updated during the training phase. Fully connected layers and convolutional layers are trainable. Dropout, softmax, and maxpooling layers are non-trainable. As shown in Figure 3, different turning points exist where the CPU time significantly increases compared to the previous configuration (i.e., one more layer is moved into the TrustZone); Tukey HSD (Abdi and Williams, 2010) was used for the post hoc pairwise comparison. When conducting training, the turning points appear when putting the maxpooling layer in the TrustZone for AlexNet (see Figure 2(a)) and when putting the dropout layer and the maxpooling layer for VGG-7 (see Figure 2(b)). All these layers are non-trainable. When conducting inference, the turning points appear when including the convolutional layers in TrustZone for both AlexNet (see Figure 2(c)) and VGG-7 (see Figure 2(d)), which are one step behind those points when conducting training.
One possible reason for the increased CPU time during inference is that the TrustZone needs to conduct extra operations (e.g., related secure memory allocation) for the trainable layer, as shown in Figure 2(c) and Figure 2(d) where all increases happen when one trainable layer is included in the TrustZone. Since we only conduct one-time inference during experiments, the operations of invoking TEE libraries, creating the TA, and allocating secure memory for the first time significantly increased the execution time compared to the next operations. Every subsequent inference attempt (continuously without rebuilding the model) does not include additional CPU time overhead. Figure 4 also shows that most of the increased CPU execution time (from 0.1s to 0.6s) is observed in the kernel mode—which includes the execution in TrustZone. The operation that needs to create the TA (to restart the TEE and load TEE libraries from scratch), such as one-time inference, should be taken care of by preloading the TA before conducting inference in practical applications.
During training, the main reason for the increased CPU time is that protecting non-trainable layers in the TrustZone results in an additional transmission of their previous trainable layers from the REE to the TrustZone. Non-trainable layers (i.e., dropout and max-pooling layers) are processed using a trainable layer as the base, and the non-trainable operation manipulates its previous layer (i.e., the trainable layer) directly. To hide the non-trainable layer and to prevent its next layer from being transferred to the REE during backward propagation (as mentioned in Section 3.4), we also move the previous convolutional layer to the TrustZone, so the turning points during training are one layer in front of the turning points during inference. Therefore, in practical applications, we should protect the trainable layer and its previous non-trainable layer together, since protecting only the non-trainable layer still requires moving its trainable layer into the TrustZone and does not reduce the cost.
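The practical consequence, that a protected non-trainable layer pulls its trainable base layer into the TrustZone, can be expressed as a small boundary adjustment. The sketch below is a hypothetical helper rather than part of our implementation; the layer-type strings and the 0-based indexing are assumptions made for illustration.

```python
# Layer types that operate directly on a preceding trainable layer.
NON_TRAINABLE = {"dropout", "maxpool"}


def partition_start(layer_types, first_protected):
    """Return the index of the first layer that must live in the TrustZone.

    If the requested boundary falls on a non-trainable layer, the boundary is
    moved back until it reaches the trainable layer that this layer uses as
    its base.
    """
    start = first_protected
    while start > 0 and layer_types[start] in NON_TRAINABLE:
        start -= 1
    return start


# Hypothetical small network, listed from the first layer to the last.
layers = ["conv", "maxpool", "conv", "dropout", "connected", "softmax", "cost"]
print(partition_start(layers, 3))  # 2: protecting dropout also moves its conv layer
print(partition_start(layers, 4))  # 4: a trainable layer can start the partition
```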
Training with the TrustZone does not significantly influence the memory usage (in the REE), as it is similar to training without TrustZone (see Figure 4(a)). Inference with TrustZone uses less memory (in the REE) (see Figure 4(b)), but there is still no difference when more layers are placed into TrustZone. Memory usage (in the REE) decreases since layers are moved to TrustZone and occupy secure memory instead. We measure the TA’s memory usage using all mapping sizes in secure memory based on the TA’s abort information. The TA uses five memory regions with sizes of 0x1000, 0x101000, 0x1e000, 0xa03000, and 0x1000, and the total is the same for all configurations. The mapping size of secure memory is fixed when the TEE run-time allocates memory for the TA, and it does not change when more layers are moved into that memory. Therefore, because model sizes differ, a good setting is to maximize the TA’s memory mapping size in TrustZone in order to hold several layers of a possibly large model.
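The total size of the five mapped regions follows directly from the listed values. The snippet below simply adds them up, under the assumption that the reported mapping sizes are in bytes.

```python
# Mapping sizes of the TA's five memory regions (assumed to be in bytes).
region_sizes = [0x1000, 0x101000, 0x1E000, 0xA03000, 0x1000]

total_bytes = sum(region_sizes)
total_kib = total_bytes / 1024

print(hex(total_bytes), total_bytes, total_kib)  # 0xb24000 11681792 11408.0
```

Under that assumption the TA maps about 11.1 MiB of secure memory, dominated by the single 0xa03000 region.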
For training, the power consumption significantly decreases (p < 0.001) when more layers are moved inside TrustZone (see Figure 4(c)). In contrast, the power consumption during inference significantly increases (p < 0.001) as shown in Figure 4(d). In both training and inference settings, the trend of power consumption is likely related to the change of CPU time (see Figure 2). More specifically, their trajectories in the figures have the same turning points (i.e., decreases or increases when moving the same layer to the TEE). One reason for the increased power consumption during inference is the significant increase in the number of CPU executions for invoking the required TEE libraries that consume additional power. When a large number of low-power operations (e.g., memory operations for mapping areas) are involved, the power consumption (i.e., energy consumed per unit time) could be lower compared to when a few CPU-bound computationally-intensive operations are running. This might be one of the reasons behind the decreased power consumption during training.
System performance on large models. We also test the performance of DarkneTZ on several models trained on ImageNet when protecting the last layer only, including the softmax layer (or the pooling layer) and the cost layer in TrustZone, in order to hide confidence scores and the calculation of cost. The results show that the overhead of protecting large models is negligible (see Figure 6): increases in CPU time, memory usage, and power consumption are lower than for all models. Among these models, the smaller models (e.g., Tiny Darknet and Darknet Reference model) tend to have a higher rate of increase of CPU time compared to the larger models (e.g., Darknet-53 model), indicating that with larger models, the influence of TrustZone protection on resource consumption becomes relatively less.
System performance summary. In summary, it is practical to process a sequence of a sensitive DNN model’s layers inside the TEE of a mobile device. Putting the last layer in the TrustZone does not increase CPU time and only slightly increases memory usage (by no more than 1%). The power consumption increase is also minor (no more than 0.5%) when fine-tuning the models. For inference, securing the last layer does not increase memory usage but increases CPU time and power consumption (by 3%). Including more layers to fully utilize the TrustZone during training can further increase CPU time (by 10%) but does not harm power consumption. One-time inference with multiple layers in the TrustZone still requires further development, such as preloading the TA, in practical applications.
|Dataset||Model||Train Acc.||Test Acc.||Attack Pre.||Attack Pre. (DTZ)|
We conduct the white-box MIA (Section 4.3) on all target models (see Section 4.1 for the choice of models) to analyze the privacy risk while protecting several layers in the TrustZone. We used the standard precision and recall metrics, similar to previous works (Shokri et al., 2017). In our context, precision is the fraction of records that an attacker infers as being members, that are indeed members in the training set. Recall is the fraction of training records that had been identified correctly as members. The performance for both models and MIAs are shown in Table 1. Figure 7 shows the attack success precision and recall for different configurations of DarkneTZ. In each configuration, a different number of layers is protected by TrustZone before we launch the attack. The configurations with zero layers protected correspond to DarkneTZ being disabled (i.e., with our defense disabled). In particular, we measure the MIA adversary’s success following two main configuration settings of DarkneTZ. In the first setting, we incrementally add consecutive layers in the TrustZone starting from the front layers and moving to the last layers until the complete model is protected. In the second setting we do the opposite: we start from the last layer and keep adding previous layers in TrustZone for each configuration. Our results show that when protecting the first layers in TrustZone, the attack success precision does not change significantly. In contrast, hiding the last layers can significantly decrease the attack success precision, even when only a single layer (i.e., the last layer) is protected by TrustZone. The precision decreases to 50% (random guessing) no matter how accurate the attack is before the defense. For example, for the AlexNet model trained on CIFAR-100, the precision drops from 85% to 50% when we only protect the last layer in TrustZone. 
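The two metrics as defined above can be computed directly from the attacker’s membership guesses. The following Python sketch uses hypothetical predictions and ground-truth labels to illustrate the definitions.

```python
def precision_recall(predicted_member, is_member):
    """Compute attack precision and recall from boolean guesses and ground truth.

    precision: fraction of records inferred as members that are indeed members.
    recall:    fraction of true members that were identified as members.
    """
    pairs = list(zip(predicted_member, is_member))
    tp = sum(1 for p, m in pairs if p and m)
    fp = sum(1 for p, m in pairs if p and not m)
    fn = sum(1 for p, m in pairs if not p and m)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical attack result on six records.
guesses = [True, True, True, False, False, True]
truth = [True, True, False, True, False, False]
precision, recall = precision_recall(guesses, truth)
print(precision, recall)
```

In this toy case two of the four records flagged as members are true members, so the precision equals the 50% random-guessing level discussed above, while two of the three true members are found.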
Precision is much higher than recall since the number of members in the adversary’s training set is larger than that of non-members, so the MIA model predicts member images better. The results also show that the membership information that leaks during inference and fine-tuning is very similar. Moreover, according to (Nasr et al., 2019) and (Shokri et al., 2017), the attack success precision is influenced by the size of the attackers’ training dataset. We used relatively large datasets (half of the target datasets) for training MIA models so that it is hard for the attacker to increase success precision significantly in our defense setting. Therefore, by hiding the last layer in TrustZone, the adversary’s attack precision degrades to 50% (random guess) while the overhead is under 3%.
We also evaluated the privacy risk when DarkneTZ protects the model’s outputs in TrustZone by normalizing them before outputting prediction results. In this configuration we conduct the white-box MIAs when all other layers (in the untrusted REE) are accessible by the adversary. This means that the cost function is protected, and the confidence scores that are output are controlled by TrustZone. Three combinations of models and datasets, namely AlexNet, VGG-7, and ResNet on CIFAR-100, are selected as they were identified as more vulnerable to MIAs (i.e., with high attack precision; see Table 1) (Nasr et al., 2019). DarkneTZ is set to control the model’s outputs in three different ways: (a) top-1 class with its confidence score; (b) top-5 classes with their confidence scores; (c) all classes with their confidence scores. As shown in Figure 8, all three methods can significantly decrease the attack success performance to around 50% (i.e., random guess). Therefore, we found that it is highly practical to use DarkneTZ to tackle MIAs: it incurs low resource consumption cost while achieving high privacy guarantees.
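The three output-control options (a) to (c) amount to releasing only a selected part of the confidence vector from the trusted side. The sketch below illustrates this selection in Python; the function name and the scores are hypothetical, and the normalization step is not modeled.

```python
def control_output(confidences, top_k=None):
    """Return (class index, confidence) pairs, highest confidence first.

    top_k=1 corresponds to option (a), top_k=5 to option (b), and
    top_k=None (all classes) to option (c).
    """
    ranked = sorted(enumerate(confidences), key=lambda kv: kv[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical confidence scores for a four-class model.
scores = [0.05, 0.6, 0.1, 0.25]
print(control_output(scores, top_k=1))  # [(1, 0.6)]
print(control_output(scores, top_k=2))  # [(1, 0.6), (3, 0.25)]
print(len(control_output(scores)))      # 4
```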
Effects of the model size. We showed that protecting large models with TrustZone tends to have a lower rate of increase of CPU execution time than protecting small models (see Figure 6). One possible explanation is that the last layer of a larger model uses a lower proportion of computational resources in the whole model compared to that of a smaller model. We have also examined the effect of different hardware: we executed our implementation of DarkneTZ with similar model sizes on a Raspberry Pi 3 Model B (RPi3B) and found it to have a lower rate of increase of cost (i.e., lower overhead) than when executed on the Hikey board (Mo et al., 2019). This is because the Hikey board has much faster processors optimized for matrix calculations, which renders additional operations of utilizing TrustZone more noticeable compared to other normal executions (e.g., deep learning operations) in the REE. Moreover, our results show that a typical configuration (16MiB secure memory) of the TrustZone is sufficient to hold at least the last layer of practical DNN models (e.g., trained on ImageNet). However, it is challenging to fit multiple layers of large models in a significantly smaller TEE. We tested a TEE with 5MiB secure memory on a Grapeboard (https://www.grapeboard.com/): only 1,000 neurons (corresponding to 1,000 classes) in the output layer already occupy 4MiB memory when using floating-point arithmetic. In such environments, model compression, such as pruning (Han et al., 2015) and quantization (Wang et al., 2019; Jacob et al., 2018), could be one way to facilitate including more layers in the TEE. Lastly, we found that utilizing TEEs for protecting the last layer does not necessarily lead to resource consumption overhead, which deserves further investigation in future work. Overall, our results show that utilizing TrustZone to protect outputs of large DNN models is effective and highly efficient.
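A back-of-the-envelope calculation shows how an output layer of this size reaches roughly 4MiB and why quantization helps. The sketch below assumes a fully connected output layer with 1,024 inputs and 4-byte floating-point weights; the input width is an assumption chosen for illustration, and biases and activation buffers are ignored.

```python
def fc_weight_bytes(n_inputs, n_outputs, bytes_per_weight):
    """Memory taken by the weight matrix of a fully connected layer."""
    return n_inputs * n_outputs * bytes_per_weight


MIB = 1024 * 1024

# Assumed layer shape: 1,024 inputs feeding 1,000 output neurons (classes).
float32_bytes = fc_weight_bytes(1024, 1000, bytes_per_weight=4)
int8_bytes = fc_weight_bytes(1024, 1000, bytes_per_weight=1)

print(float32_bytes / MIB)  # 3.90625 MiB with 32-bit floats
print(int8_bytes / MIB)     # 0.9765625 MiB after 8-bit quantization
```

Under these assumptions, 8-bit quantization alone would free about three quarters of the space taken by the output layer in a 5MiB TEE.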
Extrapolating for other mobile-friendly models. We have used Tiny Darknet and Darknet Reference for testing DarkneTZ’s performance on mobile-friendly models (for ImageNet classification). Other widely used DNNs on mobile devices, Squeezenet (Iandola et al., 2016) and Mobilenet (Howard et al., 2017), define new types of convolutional layers that are not currently supported in the Darknet framework. We expect these to have a similar privacy and TEE performance footprint because of their comparable model size (4MB, 28MB, 4.8MB, 3.4MB for Tiny Darknet, Darknet Reference, Squeezenet, and Mobilenet, respectively), floating-point operations (980M, 810M, 837M, 579M), and model accuracy (58.7%, 61.1%, 59.1%, and 71.6% for Top-1) (https://github.com/albanie/convnet-burden and https://pjreddie.com/darknet/tiny-darknet/).
Improving performance. Modern mobile devices are usually equipped with a GPU or specialized processors for deep learning, such as an NPU. Our current implementation only uses the CPU but can be extended to utilize faster chips (i.e., GPU) by moving the first layers of the DNN, which are always in the REE, to these chips. By processing several layers of a DNN in a TEE (SGX) and transferring all linear layers to a GPU, Tramèr and Boneh (2019) obtained a 4x to 11x increase for verifiable and private inference on VGG16, MobileNet, and ResNet. For edge devices, another way to expedite the deep learning process is to utilize TrustZone’s AXI bus or peripheral bus, which also has an additional secure bit on the address. Accessing a GPU (or NPU) through the secure bus enables the TrustZone to control the GPU so that the confidentiality of DNN models on the GPU cannot be breached, while achieving faster execution for partitioned deep learning on devices.
Defending against other adversaries. DarkneTZ is capable of defending not only against MIAs by controlling information from outputs, but also against other types of attacks, such as training-based model inversion attacks (Fredrikson et al., 2015; Yang et al., 2019) or GAN attacks (Hitaj et al., 2017), as they also depend heavily on the model’s outputs. In addition, by controlling the output information during inference, DarkneTZ can offer servers different privacy settings that correspond to different privacy policies. For example, the options included in our experiments are outputting Top-1 only with its confidence score, outputting Top-5 with their ranks, or outputting all classes with their ranks, all of which achieve a strong defense against MIAs. Recent research (Jia et al., 2019) also manipulates confidence scores (i.e., by adding noise) to defend against MIAs, but this protection can be broken easily if a compromised OS makes the noise addition process visible to the adversary. DarkneTZ also protects layers while training models and conducting inference. The issue of private information leaking from layers’ gradients becomes more serious considering that DNN models’ gradients are shared and exchanged among devices in collaborative/federated learning. Melis et al. (2019) successfully infer private (e.g., membership) information about participants’ training data using their updated gradients. Recent research (Zhu et al., 2019) further reveals that it is possible to recover images and texts from gradients at the pixel level and token level, respectively, and that the last layers have a low loss for the recovery. By using DarkneTZ to limit the information exposure of layers, this type of attack could be weakened.
Preserving model utility. By “hiding” (instead of obfuscating) parts of a DNN model with TrustZone, DarkneTZ preserves a model’s privacy without reducing the utility of the model. Partitioning the DNN and moving its more sensitive part into an isolated TEE maintains its prediction accuracy, as no obfuscating technique (e.g., noise addition) is applied to the model. As one example of obfuscation, applying differential privacy can decrease the prediction accuracy of the model (Yu et al., 2019). Adding noise to a model with three layers trained on MNIST lowers the model accuracy at both small and large noise levels (Andrew et al., 2019; Abadi et al., 2016), and the drop grows further at large noise levels when training on CIFAR-10 (Abadi et al., 2016). To obtain reasonable accuracy when using differential privacy, one needs to train the model for more epochs, which is challenging for larger models since more computational resources are needed. In recent work, carefully crafted noise is added to confidence scores by applying adversarial examples (Jia et al., 2019). Because adding noise inevitably decreases utility, DarkneTZ achieves a better trade-off between privacy and utility than differential privacy.
We demonstrated a technique to improve model privacy for a deployed, pre-trained DNN model using an on-device Trusted Execution Environment (TrustZone). We applied the protection to individual sensitive layers of the model (i.e., the last layers), which encode a large amount of private information on training data with respect to Membership Inference Attacks. We analyzed the performance of our protection on two small models trained on the CIFAR-100 and ImageNet Tiny datasets, and six large models trained on the ImageNet dataset, during training and inference. Our evaluation indicates that, despite memory limitations, the proposed framework, DarkneTZ, is effective in improving models’ privacy at a relatively low performance cost. Using DarkneTZ adds a minor overhead of under 3% for CPU time, memory usage, and power consumption for protecting the last layer, and of 10% for fully utilizing a TEE’s available secure memory to protect the maximum number of layers (depending on the model size and configuration) that the TEE can hold. We believe that DarkneTZ is a step towards stronger privacy protection and high model utility, without significant overhead in local computing resources.
We acknowledge the constructive feedback from the anonymous reviewers. Katevas and Haddadi were partially supported by the EPSRC Databox and DADA grants (EP/N028260/1, EP/R03351X/1). This research was also funded by a gift from Huawei Technologies, a generous scholarship from the Chinese Scholarship Council, and a hardware gift from Arm.
Overfitting in neural nets: backpropagation, conjugate gradient, and early stopping. In Advances in Neural Information Processing Systems, pp. 402–408.
Downsampling leads to image memorization in convolutional autoencoders. arXiv preprint arXiv:1810.10333.