MLCapsule: Guarded Offline Deployment of Machine Learning as a Service

by Lucjan Hanzlik, et al.

With the widespread use of machine learning (ML) techniques, ML as a service has become increasingly popular. In this setting, an ML model resides on a server and users can query the model with their data via an API. However, if the user's input is sensitive, sending it to the server is not an option. Equally, the service provider does not want to share the model by sending it to the client, since this would jeopardize its intellectual property and its pay-per-query business model. In this paper, we propose MLCapsule, a guarded offline deployment of machine learning as a service. MLCapsule executes the machine learning model locally on the user's client, so the data never leaves the client. Meanwhile, MLCapsule offers the service provider the same level of control and security of its model as the commonly used server-side execution. In addition, MLCapsule is applicable to offline applications that require local execution. Beyond protecting against direct model access, we demonstrate that MLCapsule allows for implementing defenses against advanced attacks on machine learning models such as model stealing/reverse engineering and membership inference.




1. Introduction

Machine learning as a service (MLaaS) has become increasingly popular during the past five years. Leading Internet companies, such as Google, Amazon, and Microsoft, have deployed their own MLaaS. It offers a convenient way for a service provider to deploy a machine learning (ML) model and, equally, an instant way for a user/client to make use of the model in various applications. Such setups range from image analysis and translation to applications in the business domain.

While MLaaS is convenient for the user, it also comes with several limitations. First, the user has to trust the service provider with the input data. Typically, there are no means of ensuring data privacy, and recently proposed encryption mechanisms (BPTG15) come at substantial computational overhead, especially for state-of-the-art deep learning models containing millions of parameters. Moreover, MLaaS requires data transfer over the network, which results in high-volume communication and provides a new attack surface (MSCS18; OOSF18). This motivates us to come up with a client-side solution such that perfect data privacy and offline computation can be achieved.

A client-side solution, however, (seemingly) comes with a loss of control for the service provider, as the ML model has to be transferred to and executed on the client’s machine. This raises concerns about revealing details of the model or granting unrestricted access to the user. The former damages the intellectual property of the service provider, while the latter breaks the commonly enforced pay-per-query business model. Moreover, there is a broad range of attack vectors on ML models that raise severe security and privacy risks (PMSW18). A series of recent papers have shown different attacks on MLaaS that can lead to reverse engineering (TZJRR16; OASF18) and training data leakage (FLJLPR14; FJR15; SSSS17; YGFJ18; SZHBFB19). Many of these threats are facilitated by repeated probing of the ML model, which the service provider wants to prevent. Therefore, we need a mechanism that keeps the service provider in control of model access and provides ways to deploy defense mechanisms that protect the model.

1.1. Our Contributions

We propose MLCapsule, a guarded offline deployment of machine learning as a service. MLCapsule follows the popular MLaaS paradigm, but allows for client-side execution where model and computation remain secret. With MLCapsule, the service provider controls its ML model, which allows for intellectual property protection and business model maintenance. Meanwhile, the user gains perfect data privacy and offline execution, as the data never leaves the client and the protocol is transparent.

We assume that the client’s platform has access to an Isolated Execution Environment (IEE). MLCapsule uses it to provide a secure enclave in which ML model classification runs. Moreover, since the IEE provides means to prove execution of code, the service provider is assured that the secrets it sends in encrypted form can only be decrypted by the enclave. This keeps the data secure from other processes running on the client’s platform.

To support security arguments about MLCapsule, we propose the first formal model for reasoning about the security of local ML model deployment. The leading idea of our model is a property called ML model secrecy. This definition ensures that the client can simulate MLCapsule using only a server-side API. In consequence, this means that if the client is able to perform an attack against MLCapsule, the same attack can be performed on the server-side API.

We also contribute a proof-of-concept implementation of our solution. Due to its simplicity and availability, we implemented our prototype on a platform with Intel SGX, despite the fact that the current generation should not be used due to a devastating attack (see (BMWGKPSWYS18)). Note that our solution can be used on any IEE platform for which we can argue that it implements the abstract requirements defined in the security analysis section.

In more detail, we design so-called MLCapsule layers, which encapsulate standard ML layers and are executed inside the IEE. These layers are able to decrypt (unseal) the secret weights provisioned by the service provider and perform the computation in isolation. This modular approach makes it easy to combine layers and form large networks. For instance, we implement and evaluate the VGG-16 (SZ14) and MobileNet (HZCKWWAA17) neural networks. In addition, we provide an evaluation of convolution and dense layers and compare the execution time inside the IEE to a standard implementation.
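The idea of a layer that unseals its weights only for the duration of the computation can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class name `CapsuleDense`, the helper names `seal` and `unseal`, and the toy cipher (a SHA-256 counter-mode keystream with an HMAC tag) are not taken from the paper's implementation, and a real deployment would rely on the IEE's own sealing primitives.

```python
import hashlib
import hmac
import struct


def _keystream(key, n):
    """Toy keystream: SHA-256 in counter mode (illustration only, not vetted)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + struct.pack(">Q", counter)).digest()
        counter += 1
    return out[:n]


def seal(key, weights):
    """Provider side: encrypt and authenticate a flat list of float weights."""
    plain = struct.pack(f">{len(weights)}d", *weights)
    cipher = bytes(a ^ b for a, b in zip(plain, _keystream(key, len(plain))))
    tag = hmac.new(key, cipher, hashlib.sha256).digest()
    return tag + cipher


def unseal(key, blob):
    """Enclave side: verify the tag first, then decrypt."""
    tag, cipher = blob[:32], blob[32:]
    expected = hmac.new(key, cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("sealed weights were tampered with")
    plain = bytes(a ^ b for a, b in zip(cipher, _keystream(key, len(cipher))))
    return list(struct.unpack(f">{len(plain) // 8}d", plain))


class CapsuleDense:
    """Hypothetical dense layer; plaintext weights exist only inside forward()."""

    def __init__(self, key, sealed, n_in, n_out):
        self._key, self._sealed = key, sealed
        self.n_in, self.n_out = n_in, n_out

    def forward(self, x):
        w = unseal(self._key, self._sealed)  # row-major: n_out rows of n_in values
        return [
            sum(w[o * self.n_in + i] * x[i] for i in range(self.n_in))
            for o in range(self.n_out)
        ]


key = b"k" * 32  # stands in for a key only the enclave can derive
layer = CapsuleDense(key, seal(key, [1.0, 2.0, 3.0, 4.0]), n_in=2, n_out=2)
print(layer.forward([1.0, 1.0]))  # [3.0, 7.0]
```

Because each layer carries its own sealed blob, layers of this shape can be chained into a larger network without the host ever handling plaintext weights.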

The isolated code execution on the client’s platform gives MLCapsule the ability to integrate advanced defense mechanisms against attacks on machine learning models. For demonstration, we propose two defense mechanisms against reverse engineering (OASF18) and membership inference (SSSS17; SZHBFB19), respectively, and utilize a recently proposed defense (JSDMA18) against model stealing attacks (TZJRR16). We show that these mechanisms can be seamlessly incorporated into MLCapsule with negligible computation overhead, which further demonstrates the efficacy of our system.

1.2. Organization

Section 2 presents the requirements of MLCapsule. Section 3 provides the necessary technical background, followed by a summary of the related work in the field. We then present MLCapsule in detail and formally prove its security. The subsequent sections discuss the implementation and evaluation of MLCapsule and show how to incorporate advanced defense mechanisms. The paper closes with a discussion and a conclusion.

2. Requirements and Threat Model

In this section, we introduce the security requirements that MLCapsule aims to achieve.

2.1. Model Secrecy and Data Privacy

User Side. MLCapsule deploys MLaaS locally. This provides strong privacy guarantees to a user, as her data never leaves her device. Meanwhile, executing machine learning prediction locally avoids the Internet communication between the user and the service provider. Therefore, possible attacks due to network communication (MSCS18; OOSF18) are automatically eliminated.

Server Side. Deploying a machine learning model on the client side naively, i.e., providing the trained model to the user as a white box, harms the service provider in the following two respects.

  • Intellectual Property.

    Training an effective machine learning model is challenging: the MLaaS provider needs to obtain suitable training data and spend a large amount of effort on training the model and tuning various hyperparameters (WG18). All of this belongs to the intellectual property of the service provider, and providing the trained model to the client as a white box would result in the service provider completely losing this valuable information. In this paper, we consider the ML model architecture public and only the model parameters private. However, our approach can easily be extended to also protect the model architecture by using tools that protect the privacy of the code executed inside the IEE (e.g., using (BWZL18)).

  • Pay-per-query. Almost all MLaaS providers implement the pay-per-query business model. For instance, Google’s vision API charges 1.5 USD per 1,000 queries. Deploying a machine learning model on the client side naturally grants a user an unlimited number of queries, which breaks the pay-per-query business model.

To mitigate all these potential damages to the service provider, MLCapsule needs to provide the following guarantees:

  • Protect intellectual property

  • Enable the pay-per-query business model

In a more general way, we aim for a client-side deployment being indistinguishable from the current server-side deployment.
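The pay-per-query guarantee can be pictured as a counter that lives inside the enclave and is topped up only by tokens the provider has authenticated. The following is a minimal sketch under stated assumptions: `issue_token`, `QueryMeter`, and the token format are hypothetical, and the HMAC under a shared key stands in for whatever authentication a real protocol would use between provider and enclave.

```python
import hashlib
import hmac


def issue_token(provider_key, n_queries, serial):
    """Provider side: authenticate a purchase of n_queries (hypothetical format)."""
    msg = f"{serial}:{n_queries}".encode()
    return hmac.new(provider_key, msg, hashlib.sha256).digest()


class QueryMeter:
    """Assumed to run inside the enclave, so the host cannot edit `remaining`."""

    def __init__(self, provider_key):
        self._key = provider_key
        self._seen_serials = set()
        self.remaining = 0

    def redeem(self, n_queries, serial, token):
        """Add budget only for authentic tokens that were not redeemed before."""
        expected = issue_token(self._key, n_queries, serial)
        if serial in self._seen_serials or not hmac.compare_digest(token, expected):
            raise PermissionError("invalid or replayed token")
        self._seen_serials.add(serial)
        self.remaining += n_queries

    def classify(self, model, x):
        """Charge one query before running the model."""
        if self.remaining <= 0:
            raise PermissionError("query budget exhausted")
        self.remaining -= 1
        return model(x)


provider_key = b"p" * 32
meter = QueryMeter(provider_key)
meter.redeem(2, serial=1, token=issue_token(provider_key, 2, 1))
print(meter.classify(lambda x: x > 0, 5), meter.remaining)  # True 1
```

The set of redeemed serial numbers is part of the enclave state, so this sketch also depends on that state being protected against rollback.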

2.2. Protection against Advanced Attacks

Several recent works show that an adversary can perform multiple attacks against MLaaS by solely querying its API (black-box access). Attacks of this kind include model stealing (TZJRR16; WG18), reverse engineering (OASF18), and membership inference (SSSS17; SZHBFB19). These attacks are, however, orthogonal to the damages discussed in Section 2.1, as they only need black-box access to the ML model instead of white-box access. More importantly, it has been shown that current MLaaS cannot prevent these attacks either (TZJRR16; SSSS17; OASF18; WG18).

We consider mitigating these threats a requirement of MLCapsule as well. Therefore, we propose defense mechanisms against these advanced attacks and show that these mechanisms can be seamlessly integrated into MLCapsule.

3. Background

In this section, we focus on the properties of Intel’s IEE implementation called Software Guard Extensions (SGX) and recall a formal definition of Attested Execution proposed by Fisch et al. (FVBG17). We would like to stress that MLCapsule works with any IEE that implements this abstraction. We also formalize a public key encryption scheme, which we will use for the concrete instantiation of our system.

3.1. SGX

SGX is a set of instructions included in Intel’s x86 processor design that allows the creation of isolated execution environments called enclaves. According to Intel’s threat model, enclaves are designed to trustworthily execute programs and handle secrets even if the host system is malicious and the system’s memory is untrusted.

Properties. There are three main properties of Intel SGX.

  • Isolation. Code and data inside the enclave’s protected memory cannot be read or modified by any external process. Enclaves are stored in a hardware-guarded memory called the Enclave Page Cache (EPC), which is currently limited to 128 MB, with only 90 MB available to the application. Untrusted applications can execute code inside the enclave using entry points called Enclave Interface Functions (ECALLs), i.e., untrusted applications can use enclaves as external libraries that are defined by these call functions.

  • Sealing. Data stored in the host system is encrypted and authenticated using a hardware-resident key. Every SGX-enabled processor has a special key called the Root Seal Key that can be used to derive a so-called Seal Key, which is specific to the identity of the enclave. This key can then be used to encrypt/decrypt data, which can later be stored in untrusted memory. One important feature is that the same enclave can always recover the Seal Key if instantiated on the same platform; however, the key cannot be derived by other enclaves.

  • Attestation. Attestation provides an unforgeable report attesting to the code, static data, and metadata of an enclave, as well as the output of the performed computation. Attestation can be local or remote. In the first case, one enclave can derive a shared Report Key using the Root Seal Key and create a report consisting of a Message Authentication Code (MAC) over the input data. This report can be verified by a different enclave inside the same platform, since it can also derive the shared Report Key. In the case of remote attestation, the actual report for the third party is generated by a so-called Quoting Enclave that uses an anonymous group signature scheme (Intel Enhanced Privacy ID (BL10)) to sign the data.
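The sealing property, where the same enclave on the same platform recovers the same key while any other enclave derives a different one, can be sketched as a keyed derivation. This is a toy model, not the hardware mechanism: the function names and the use of HMAC-SHA256 and SHA-256 are assumptions chosen for illustration.

```python
import hashlib
import hmac


def enclave_identity(code):
    """Hypothetical enclave measurement: a hash over the enclave's code."""
    return hashlib.sha256(code).digest()


def derive_seal_key(root_seal_key, identity):
    """Toy stand-in for the hardware derivation: HMAC(root key, enclave identity)."""
    return hmac.new(root_seal_key, b"seal|" + identity, hashlib.sha256).digest()


root = b"\x01" * 32  # stands in for the per-processor Root Seal Key
id_a = enclave_identity(b"enclave A code")
id_b = enclave_identity(b"enclave B code")

# Same enclave, same platform: the key is recoverable on every instantiation.
assert derive_seal_key(root, id_a) == derive_seal_key(root, id_a)
# A different enclave on the same platform derives a different key.
assert derive_seal_key(root, id_a) != derive_seal_key(root, id_b)
# The same enclave on a different platform (different root key) also differs.
assert derive_seal_key(root, id_a) != derive_seal_key(b"\x02" * 32, id_a)
```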

Side-channel Attacks. Due to its design, Intel SGX is prone to side-channel attacks. This includes physical attacks (e.g., power analysis), although successful physical attacks have not yet been demonstrated. On the other hand, several software attacks have been demonstrated (LSGKKP17; WCPZWBTG17; BMDKCS17). An attack specifically against secure ML implementations was presented by Hua et al. (BWZL18). Attacks of this kind usually target flawed implementations, and a knowledgeable programmer can write the code in a data-oblivious way, i.e., such that the software has no memory access patterns or control flow branches that depend on secret data. In particular, those attacks are not inherent to SGX-like systems (CLD16). Recently, Van Bulck et al. (BMWGKPSWYS18) presented a devastating attack on SGX that compromises the whole system, making the current generation of the SGX technology useless. Even though the current SGX generation should not be used in practice, future instantiations should provide a real-world implementation of the abstract security requirements needed to secure MLCapsule.
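To make the notion of data-oblivious code concrete, the sketch below reads a table entry at a secret index while touching every entry in the same order and without branching on the secret. The function names are hypothetical. Python itself gives no constant-time guarantees, so this only illustrates the structure of such code; an actual enclave implementation would be written in a lower-level language and checked at the instruction level.

```python
def oblivious_select(cond, a, b):
    """Return a if cond == 1 else b, using arithmetic masks instead of a branch."""
    mask = -cond  # 0 stays 0, 1 becomes -1 (all bits set)
    return (a & mask) | (b & ~mask)


def oblivious_lookup(table, secret_index):
    """Read table[secret_index] while scanning every entry in a fixed order."""
    result = 0
    for i, value in enumerate(table):
        hit = int(i == secret_index)  # comparison result is used as data only
        result = oblivious_select(hit, value, result)
    return result


print(oblivious_lookup([10, 20, 30], 1))  # 20
```

The access pattern (one pass over the whole table) is the same for every value of `secret_index`, which is the property the text refers to.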

Rollback. The formal model described in the next subsection assumes that the state of the hardware is hidden from the user’s platform. SGX enclaves store encryptions of the enclave’s state in the untrusted part of the platform. Those encryptions are protected using a hardware-generated secret key, yet this data is provided to the enclave by an untrusted application. Therefore, SGX does not provide any guarantees about the freshness of the state and is vulnerable to rollback attacks. Fortunately, there exist hardware solutions relying on counters (SP16) and distributed software-based strategies (MAKDSGJC17) that can be used to prevent rollback attacks.
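A counter-based freshness check of the kind mentioned above can be sketched as follows. The `MonotonicCounter` class is a hypothetical stand-in for a trusted hardware counter, and the state is only authenticated (not encrypted) to keep the example short; neither detail is taken from the cited solutions.

```python
import hashlib
import hmac
import struct


class MonotonicCounter:
    """Hypothetical trusted counter that can only move forward."""

    def __init__(self):
        self._value = 0

    def increment(self):
        self._value += 1
        return self._value

    def read(self):
        return self._value


def save_state(seal_key, counter, state):
    """Bind the state to a fresh counter value before handing it to the host."""
    version = struct.pack(">Q", counter.increment())
    tag = hmac.new(seal_key, version + state, hashlib.sha256).digest()
    return tag + version + state


def load_state(seal_key, counter, blob):
    """Accept the blob only if it is authentic and carries the latest version."""
    tag, version, state = blob[:32], blob[32:40], blob[40:]
    expected = hmac.new(seal_key, version + state, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("state was modified")
    if struct.unpack(">Q", version)[0] != counter.read():
        raise ValueError("stale state: possible rollback")
    return state


seal_key = b"s" * 32
counter = MonotonicCounter()
old_blob = save_state(seal_key, counter, b"budget=10")
new_blob = save_state(seal_key, counter, b"budget=9")
print(load_state(seal_key, counter, new_blob))  # b'budget=9'
```

Replaying `old_blob` fails the version check even though its tag is valid, which is exactly the case a sealing key alone does not cover.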

3.2. Definition for SGX-like Hardware

There are many papers that discuss hardware security models in a formalized way. The general consensus is that those abstractions are useful to formally argue about the security of the system.

Barbosa et al. (BPSW16) define a generalized ideal interface to represent SGX-like systems that perform attested computation. A similar model was proposed by Fisch et al. (FVBG17) but was designed specifically to abstract Intel’s SGX and support local and remote attestation. Pass, Shi, and Tramèr (PST17) proposed an abstraction of attested execution in the universal composability (UC) model. In this paper we will focus on the formal hardware model by Fisch et al. (FVBG17). We decided to use this particular model because it was specifically defined to abstract the features that are supported by SGX which is the hardware used by our implementation. However, since we will only use remote attestation in our instantiation, we omit the local attestation part and refer the reader to the original paper for a full definition.

Informally, this ideal functionality allows a registered party to install a program inside an enclave, which can then be resumed on any given input. An instance of this enclave possesses internal memory that is hidden from the registering party. However, the main property of attested execution is that the enclave creates an attestation of execution. This attestation provides a proof for third parties that the program was executed on a given input yielding a particular output.

Formal Definition. We define a secure hardware as follows.

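Definition.

A secure hardware functionality $\mathsf{HW}$ for a class of probabilistic polynomial time programs $\mathcal{Q}$ consists of the following interface: $\mathsf{HW.Setup}$, $\mathsf{HW.Load}$, $\mathsf{HW.Run}$, $\mathsf{HW.Run\&Quote}_{sk_{quote}}$, $\mathsf{HW.QuoteVerify}$. $\mathsf{HW}$ also has an internal state $\mathsf{state}$ that consists of a variable $\mathsf{HW}.sk_{quote}$ and a table $T$ consisting of enclave state tuples indexed by enclave handles. The variable $\mathsf{HW}.sk_{quote}$ will be used to store signing keys and the table $T$ will be used to manage the state of the loaded enclave.

  • $\mathsf{HW.Setup}(1^\lambda)$: given input security parameter $1^\lambda$, it generates the secret key $sk_{quote}$ and stores it in $\mathsf{HW}.sk_{quote}$. It also generates and outputs public parameters $\mathsf{params}$.

  • $\mathsf{HW.Load}(\mathsf{params}, Q)$: given input global parameters $\mathsf{params}$ and program $Q$, it first creates an enclave, loads $Q$, and then generates a handle $\mathsf{hdl}$ that will be used to identify the enclave running $Q$. Finally, it sets $T[\mathsf{hdl}] = \emptyset$ and outputs $\mathsf{hdl}$.

  • $\mathsf{HW.Run}(\mathsf{hdl}, \mathsf{in})$: it runs $Q$ at state $T[\mathsf{hdl}]$ on input $\mathsf{in}$ and records the output $\mathsf{out}$. It sets $T[\mathsf{hdl}]$ to be the updated state of $Q$ and outputs $\mathsf{out}$.

  • $\mathsf{HW.Run\&Quote}_{sk_{quote}}(\mathsf{hdl}, \mathsf{in})$: executes a program in an enclave similar to $\mathsf{HW.Run}$ but additionally outputs an attestation that can be publicly verified. The algorithm first executes $Q$ on $\mathsf{in}$ to get $\mathsf{out}$, and updates $T[\mathsf{hdl}]$ accordingly. Finally, it outputs the tuple $\mathsf{quote} = (\mathsf{md}_{\mathsf{hdl}}, \mathsf{tag}_Q, \mathsf{in}, \mathsf{out}, \sigma)$: $\mathsf{md}_{\mathsf{hdl}}$ is the metadata associated with the enclave, $\mathsf{tag}_Q$ is a program tag for $Q$, and $\sigma$ is a signature on $(\mathsf{md}_{\mathsf{hdl}}, \mathsf{tag}_Q, \mathsf{in}, \mathsf{out})$.

  • $\mathsf{HW.QuoteVerify}(\mathsf{params}, \mathsf{quote})$: given the input global parameters $\mathsf{params}$ and $\mathsf{quote}$, this algorithm outputs $1$ if the signature verification of $\sigma$ succeeds. It outputs $0$ otherwise.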
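The interface above can be mirrored in a toy Python model to show how the table of enclave states and the quote fit together. This sketch rests on several assumptions: the class and method names are hypothetical, programs are modeled as functions mapping `(state, input)` to `(state, output)`, and an HMAC stands in for the signature, which means verification here needs the secret key and is therefore not publicly verifiable as the definition requires.

```python
import hashlib
import hmac
import secrets


class ToyHW:
    """Toy model of the secure-hardware interface; HMAC replaces the signature."""

    def setup(self, security_bytes=32):
        self._sk_quote = secrets.token_bytes(security_bytes)  # plays HW.sk_quote
        self._table = {}                                       # T: hdl -> (program, state)
        self.params = hashlib.sha256(b"toy-hw-params").digest()
        return self.params

    def load(self, params, program):
        hdl = len(self._table)
        self._table[hdl] = (program, None)  # empty initial state
        return hdl

    def run(self, hdl, inp):
        program, state = self._table[hdl]
        state, out = program(state, inp)
        self._table[hdl] = (program, state)  # keep the updated state in T
        return out

    def run_and_quote(self, hdl, inp):
        out = self.run(hdl, inp)
        program, _ = self._table[hdl]
        md, tag_q = hdl, program.__name__  # stand-ins for enclave metadata and program tag
        body = repr((md, tag_q, inp, out)).encode()
        sigma = hmac.new(self._sk_quote, body, hashlib.sha256).digest()
        return (md, tag_q, inp, out, sigma)

    def quote_verify(self, params, quote):
        *fields, sigma = quote
        body = repr(tuple(fields)).encode()
        expected = hmac.new(self._sk_quote, body, hashlib.sha256).digest()
        return int(hmac.compare_digest(sigma, expected))


def counting_doubler(state, inp):
    """Example program: doubles its input and counts how often it was called."""
    calls = (state or 0) + 1
    return calls, inp * 2


hw = ToyHW()
params = hw.setup()
hdl = hw.load(params, counting_doubler)
quote = hw.run_and_quote(hdl, 3)
print(quote[3], hw.quote_verify(params, quote))  # 6 1
```

Changing any field of the quote, for example the claimed output, makes `quote_verify` return 0.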

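Correctness. A scheme is correct if the following holds. For all $\mathsf{aux}$, all programs $Q \in \mathcal{Q}$, all $\mathsf{in}$ in the input domain of $Q$, and all handles $\mathsf{hdl}$ we have:

  • if there exist random coins $r$ (sampled in run time and used by $Q$) such that $\mathsf{out} = Q(\mathsf{in})$, then

  • $\Pr[\mathsf{HW.QuoteVerify}(\mathsf{params}, \mathsf{quote}) = 0] = \mathsf{negl}(\lambda)$, where $\mathsf{quote} = \mathsf{HW.Run\&Quote}_{sk_{quote}}(\mathsf{hdl}, \mathsf{in})$.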

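Remote attestation unforgeability is modeled by a game between a challenger $\mathcal{C}$ and an adversary $\mathcal{A}$.

  1. $\mathcal{A}$ provides an $\mathsf{aux}$.

  2. $\mathcal{C}$ runs $\mathsf{HW.Setup}(1^\lambda, \mathsf{aux})$ in order to obtain public parameters $\mathsf{params}$, secret key $sk_{quote}$ and an initialization string $\mathsf{state}$. It gives $\mathsf{params}$ to $\mathcal{A}$, and keeps $sk_{quote}$ and $\mathsf{state}$ secret in the secure hardware.

  3. $\mathcal{C}$ initializes a list $\mathsf{query} = \{\}$.

  4. $\mathcal{A}$ can run $\mathsf{HW.Load}$ on any input $(\mathsf{params}, Q)$ of its choice and get back $\mathsf{hdl}$.

  5. $\mathcal{A}$ can also run $\mathsf{HW.Run\&Quote}_{sk_{quote}}$ on input $(\mathsf{hdl}, \mathsf{in})$ of its choice and get $\mathsf{quote} = (\mathsf{md}_{\mathsf{hdl}}, \mathsf{tag}_Q, \mathsf{in}, \mathsf{out}, \sigma)$, where the challenger puts the tuple $(\mathsf{md}_{\mathsf{hdl}}, \mathsf{tag}_Q, \mathsf{in}, \mathsf{out})$ into $\mathsf{query}$.

  6. $\mathcal{A}$ finally outputs $\mathsf{quote}^* = (\mathsf{md}^*_{\mathsf{hdl}}, \mathsf{tag}^*_Q, \mathsf{in}^*, \mathsf{out}^*, \sigma^*)$.

$\mathcal{A}$ wins the above game if $\mathsf{HW.QuoteVerify}(\mathsf{params}, \mathsf{quote}^*) = 1$ and $(\mathsf{md}^*_{\mathsf{hdl}}, \mathsf{tag}^*_Q, \mathsf{in}^*, \mathsf{out}^*) \notin \mathsf{query}$. The hardware model is remote attestation unforgeable if no adversary can win this game with non-negligible probability.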

3.3. Public Key Encryption

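Definition.

Given a plaintext space $\mathcal{M}$ we define a public key encryption scheme as a tuple of probabilistic polynomial time algorithms:

  • $\mathsf{KeyGen}(1^\lambda)$: on input security parameters, this algorithm outputs the secret key $sk$ and public key $pk$.

  • $\mathsf{Enc}(pk, m)$: on input public key $pk$ and message $m \in \mathcal{M}$, this algorithm outputs a ciphertext $c$.

  • $\mathsf{Dec}(sk, c)$: on input secret key $sk$ and ciphertext $c$, this algorithm outputs message $m$.

Correctness. A public key encryption scheme is correct if for all security parameters $\lambda$, all messages $m \in \mathcal{M}$ and all keypairs $(sk, pk) \leftarrow \mathsf{KeyGen}(1^\lambda)$ we have $\mathsf{Dec}(sk, \mathsf{Enc}(pk, m)) = m$.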
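As a concrete instance of the three algorithms and the correctness condition, the sketch below uses textbook ElGamal over the integers modulo a prime. The choice of scheme, the modulus $2^{127} - 1$, and the base 3 are assumptions made purely for illustration. The parameters are far too small for real use, and the paper does not state that it instantiates the scheme this way.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small for real security
G = 3           # fixed base, assumed for illustration


def keygen():
    """KeyGen: sample a secret exponent and publish G^sk mod P."""
    sk = secrets.randbelow(P - 2) + 1  # sk in [1, P-2]
    pk = pow(G, sk, P)
    return sk, pk


def enc(pk, m):
    """Enc: the plaintext space is the integers 1 .. P-1."""
    if not 1 <= m < P:
        raise ValueError("message outside plaintext space")
    r = secrets.randbelow(P - 2) + 1  # fresh randomness per ciphertext
    return pow(G, r, P), (m * pow(pk, r, P)) % P


def dec(sk, c):
    """Dec: remove the shared mask G^(r*sk) from the second component."""
    c1, c2 = c
    shared = pow(c1, sk, P)
    return (c2 * pow(shared, P - 2, P)) % P  # inverse via Fermat's little theorem


sk, pk = keygen()
for m in (1, 42, P - 1):
    assert dec(sk, enc(pk, m)) == m  # the correctness condition
```

Because `enc` draws fresh randomness on every call, encrypting the same message twice yields different ciphertexts, which is a prerequisite for the indistinguishability notion that follows.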

Ciphertext Indistinguishability Against CPA (Chosen Plaintext Attack).
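
For reference, the standard IND-CPA experiment for a scheme $(\mathsf{KeyGen}, \mathsf{Enc}, \mathsf{Dec})$ is commonly written as below. The experiment name, the two-stage adversary $\mathcal{A} = (\mathcal{A}_1, \mathcal{A}_2)$, and the state variable $st$ are assumed notation for this sketch and are not taken from this paper.

```latex
\begin{align*}
&\mathsf{Exp}^{\mathsf{IND\text{-}CPA}}_{\mathcal{A}}(\lambda):\\
&\quad (sk, pk) \leftarrow \mathsf{KeyGen}(1^\lambda)\\
&\quad (m_0, m_1, st) \leftarrow \mathcal{A}_1(pk), \quad m_0, m_1 \in \mathcal{M},\ |m_0| = |m_1|\\
&\quad b \leftarrow \{0, 1\}, \quad c^* \leftarrow \mathsf{Enc}(pk, m_b)\\
&\quad b' \leftarrow \mathcal{A}_2(st, c^*)\\
&\quad \text{return } 1 \text{ if } b' = b, \text{ else } 0
\end{align*}
```

Under this formulation, a scheme is IND-CPA secure if for every probabilistic polynomial time adversary $\mathcal{A}$ the advantage $\left|\Pr[\mathsf{Exp}^{\mathsf{IND\text{-}CPA}}_{\mathcal{A}}(\lambda) = 1] - \tfrac{1}{2}\right|$ is negligible in $\lambda$.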
