Cloud storage services such as Amazon Web Services, Google Cloud Platform or Microsoft Azure have seen rapid adoption in recent years [seybert_internet_2014]. However, they offer end users few trustworthiness and confidentiality guarantees. Although threats commonly originate from malicious adversaries breaching security measures to gain access to user data, the menace can also come from an employee with generous privileges, or from curious governments warranting data collection for national interest. To overcome these issues, many approaches rely on cryptographic solutions in which the data is secured on the client side before reaching the storage premises [bessani2014scfs], thereby mitigating the lack of trust in the cloud provider.
To enable collaborative operations on the already secured data, one needs to enforce access control policies. Because of the untrusted nature of cloud storage, such administrative operations also need to be cryptographically protected. Enforcing cryptographic access control in an untrusted cloud storage context is subject to a number of requirements. First, access control schemes should incur as little traffic overhead as possible because cloud storage has slower response times than traditional storage media. Second, since a realistic and dynamic pattern of membership operations [Garrison:2016:DACCloud] coupled with large volumes of users can make the system unusable in practice, the computational cost of the system must stay within an acceptable bound. Third, because access control operations are performed only by privileged users, these users must gain zero knowledge of the data content to which the policy is applied. Last, the identities normally employed by users when interacting with the cloud storage (i.e. existing credentials) should be sufficient for membership operations, hence avoiding complex trust establishment protocols.
A number of cryptographic constructions have been proposed for achieving access control. The simplest one, popularly referred to as Hybrid Encryption (HE), makes use of symmetric and public-key cryptography, by employing the former on the actual data and the latter on the symmetric key [goh2003sirius]. Other approaches rely on pairing-based cryptography as a substitute for public-key cryptography, and offer different levels of granularity for specifying access control policies. A few examples include: Identity Based Encryption (IBE) [boneh2001identity], which works similarly to public-key encryption at the identity level; Attribute-Based Encryption (ABE) [goyal2006attribute], which supports a richer tree-like access policy expressiveness; and Identity-Based Broadcast Encryption (IBBE) [sakai2007identity], which can capture group-like policies. Similarly, identity-based proxy re-encryption relies on a semi-trusted middle entity to whom users delegate the re-encryption rights [green2007identity].
Unfortunately, pairing-based constructions suffer from important performance issues. According to Garrison et al. [Garrison:2016:DACCloud], they are an order of magnitude slower than public-key cryptography. Remarkably, even Hybrid Encryption with Public Key (HE-PKI) incurs prohibitive costs for dynamic access control (see Figure 2). Moreover, the aforementioned constructions do not guarantee our zero knowledge requirement.
In this paper, we introduce a new cryptographic access control scheme that is both computationally and storage-efficient under a dynamic and large set of membership operations, while offering zero knowledge guarantees. Zero knowledge is guaranteed by executing the cryptographic access control membership operations in a Trusted Execution Environment (TEE). (Footnote 1: Only membership operations rely on the TEE; user operations are done in a conventional execution environment.) Our scheme is based on IBBE, which is known to be flexible enough to produce small constant policy sizes. Its main drawback is its impractically high computational cost. Our solution is to execute the membership operations of IBBE within the TEE so that we can make use of a master secret key. The TEE guarantees that this secret stays within the trusted computing boundary. We can therefore propose an optimization of a well-studied IBBE scheme [delerablee2007identity] that drastically reduces its computational complexity. A remaining issue is the computational complexity required for users to derive membership changes. To mitigate this last aspect, we propose a group partitioning mechanism such that the computational cost on the user side is bounded by a fixed, constant partition size rather than by the potentially large group size.
We have implemented our new access control scheme using Intel SGX as TEE. To the best of our knowledge, we are the first to adapt the pairing-based-specific library PBC [lynn2006pbc] and its underlying dependency GMP [granlund1991gmp] to accurately run within SGX. Moreover, we deployed our system on a commercially-available public cloud storage. Our evaluation shows that our scheme only requires few resources while performing better than HE, in addition to providing zero knowledge.
To summarize, our contributions are the following:
We propose a new approach to IBBE by relying on Intel SGX. To the best of our knowledge, this is the first effort seeking to lower the computational complexity bound of a well-studied IBBE scheme using TEEs as enabling technology. Additionally, our scheme only requires TEE support for a minimal set of users (i.e. administrators).
We instantiate the novel IBBE-SGX construction in an access control system hosted on an honest-but-curious cloud storage, proposing an original partitioning scheme that lowers the time required by users to ingest access control changes.
We fully implemented our original access control system and evaluated it against a realistic setup. We conducted extensive evaluations, showing that our system surpasses the performance obtained using state-of-the-art approaches.
Even though the main motivation for this work is to securely share data in a cloud environment, the proposed solution can be applied for encrypting arbitrary information that is securely broadcast to a group of users over any shared medium. Other examples, besides cloud storage, are peer-to-peer networks or pay-per-view TV.
The rest of the paper is organized as follows. Section II presents the model assumptions undertaken within the problem context. We provide some background about Intel SGX and cryptographic schemes in Section III. Section IV presents the constructions that allow us to lower the complexity of an access control scheme by relying on a TEE. The second part of that section details the partitioning mechanism. We describe the design and implementation of an end-to-end system built on top of the scheme in Section V. Section VI presents the evaluation of our solution by performing both micro- and macro-benchmarks. Section VII presents the related work in the fields of cryptography, access control systems and SGX. Finally, Section VIII concludes and presents future work.
In this study, we consider groups of users who perform collaborative editing on cryptographically-protected data stored on untrusted cloud storage systems. The data is protected using a block cipher encryption algorithm such as Advanced Encryption Standard (AES) making use of a symmetric group key gk. As illustrated in Figure 1, this work addresses the challenge of designing a system for group access control, in which the group key gk is cryptographically protected and derivable only by the members of the group. Because groups may become large with a significant turnover in their members, we investigate the implication of numerous member additions and revocations happening throughout their lifetimes.
We distinguish between two types of actors interacting within the system: administrators and users. All group membership operations are performed by administrators. Their duties include creating groups, and adding or revoking group members. The administrators manifest an honest-but-curious behavior, correctly serving work requests but with a possible malicious intent of discovering the group key gk. On the other hand, users listen to the cloud storage for group membership changes, and derive the new group key gk whenever it changes. Users are considered trusted.
The role of the cloud storage is to store the definitions of group access control, also referred to as group metadata, together with the list of members composing the group, and the actual group data. Besides being a storage medium, we also use the cloud storage as a broadcasting interface for group access control changes. Administrators communicate with the cloud each time a group membership operation takes place so that users can be notified of the group membership update. We consider the cloud storage to behave similarly to administrators (i.e. honest-but-curious). It correctly services assigned tasks, albeit with a possible malicious intent of peeking into the group secrets. Moreover, when manifesting the curious behavior, the cloud storage could collude with any number of curious administrators or revoked users.
Size-wise, we target a solution that can accommodate very large groups. (Footnote 2: Our evaluation uses 1 million users as the largest group size.) It is desired that administrators perform membership changes for multiple groups at a time, therefore the number of administrators is relatively small when compared to the number of users.
We choose to ensure authenticity guarantees only with respect to administrator identities, and therefore authenticate membership change operations. Authenticating the group data created by users is out of scope, the current model being focused on confidentiality guarantees. Therefore, the notion of a reference monitor [sandhu1994access, Garrison:2016:DACCloud] on the cloud storage is not pertinent within our context. Finally, we do not consider hiding the identities of group members, nor the type of executed membership operations, as they can be inferred by the cloud storage from traffic access patterns. Privacy constructions offering such guarantees [mayberry2014efficient, devadas2016onion, apon2014verifiable] are orthogonal to our work.
The building blocks that lead to the creation of our access control extension are Intel SGX, Hybrid Encryption, and Identity Based Broadcast Encryption. This background section presents them together with open challenges.
III-A Intel SGX
Intel SGX is an instruction extension available on modern x86 CPUs manufactured by Intel. Similarly to ARM Trustzone [alves2004trust] or Sanctum [costan2016sanctum], SGX aims to shield code execution against attacks from privileged code (e.g. infected operating system) and certain physical attacks. A unit of code protected by SGX is called an enclave. Computations done inside the enclave cannot be seen from the outside [costan2016intel]. SGX seamlessly encrypts memory so that plaintext data is only present inside the CPU package. The assumption is that opening the CPU package is difficult for an attacker, and leaves clear evidence of the breach. Encrypted memory is provided in a processor-reserved memory area called the Enclave Page Cache (EPC), which is limited to 128 MB in the current version of SGX.
Intel provides a way for enclaves to attest each other [sgx-sdk]. After the attestation process, enclaves will be sure that each other is running the code that they are meant to execute. The attestation process can be extended to remote attestation that allows a piece of software running on a different machine to make sure that a given enclave is running on a genuine Intel SGX-capable CPU. An Intel-provided online service — the Intel Attestation Service (IAS) — is used to check the signature affixed to a quote created by the CPU [sgx-sdk]. As part of the attestation process, it is possible to provision the enclave with secrets. They will be securely transmitted to the enclave if and only if the remote attestation process succeeds.
The Trusted Computing Base (TCB) of an SGX enclave is composed of the CPU itself, and the code running within. The assumption is that we trust Intel for securely implementing SGX. Nevertheless, it has been shown that SGX is vulnerable to side-channel attacks. We consider this flaw to be orthogonal to our research, and hence do not consider it in our security evaluation.
III-B A Naive Approach to Group Access Control
Suppose that we want to come up with a simple, yet secure, cryptographic scheme to protect a group key gk. We can make use of an asymmetrical encryption primitive [ferguson2003practical], based on RSA or Elliptic Curve Cryptography (ECC). As each user in the system possesses a public-private key pair, the scheme consists in encrypting gk using the public key of each member in the group. A member of the group can then deduce gk by decrypting the resulting ciphertext using her private key. This construction is sometimes referred to as Hybrid Encryption (HE) [Garrison:2016:DACCloud], or Trivial Broadcast Encryption Scheme [stinson2005cryptography].
To achieve the zero knowledge requirement, administrators could be asked to run HE within an SGX enclave, thus protecting the discovery of gk. However, before discussing the cost of such an integration between HE and SGX, we point out a number of prior weaknesses of HE.
First, the amount of group metadata grows linearly with the number of members in the group, making it impractical in the context of very large groups. Second, when revoking group members, a new key gk needs to be created; the entire group metadata also needs to be generated again by encrypting the latest value of gk. As the group size increases, the computational cost of the scheme grows linearly. Likewise, the latency incurred for putting, getting and storing the group metadata on the cloud storage will also seriously expand.
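The two linear costs described above can be made concrete with a small accounting sketch. It performs no real encryption; the function names are hypothetical, and the default of 256 bytes per wrapped key is an assumption (the size of one RSA-2048 ciphertext), not a figure taken from this paper.

```python
def he_metadata_bytes(group_size: int, wrapped_key_bytes: int = 256) -> int:
    """Metadata size when gk is encrypted once per member.

    wrapped_key_bytes is an assumed per-member ciphertext size.
    """
    return group_size * wrapped_key_bytes


def he_revocation_encryptions(group_size_before: int) -> int:
    """Public-key encryptions needed to distribute a fresh gk after one revocation.

    Every remaining member needs the new gk encrypted under her public key.
    """
    return max(group_size_before - 1, 0)


for n in (1_000, 100_000, 1_000_000):
    print(n, he_metadata_bytes(n), he_revocation_encryptions(n))
```

Both quantities scale with the group size, which is why a single revocation in a very large group is as expensive as re-creating the whole group metadata.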
Furthermore, when performing group membership operations, the administrators need to trust the authenticity of the public keys linked to the identity of the members. A Public Key Infrastructure (PKI) [ferguson2003practical] can be used to solve this issue. Besides the trust risks that the PKI brings [ellison2000ten], one needs to account for the practical costs of setting up, running and accessing a PKI. To mitigate these risks, one could choose to substitute public-key primitives with identity-based ones. Identity Based Encryption (IBE) [boneh2001identity, waters2005efficient] makes use of arbitrary strings as public keys; we can therefore use a user name directly as a public key. The user secret key is generated at setup phase or later by a Trusted Authority (TA). Both HE-PKI and Hybrid Encryption with Identity-based Encryption (HE-IBE) have the same inner functioning, once the choice of key methodology is abstracted away.
Integrating SGX with HE-PKI and HE-IBE is required in order to guarantee the zero knowledge property against administrators. As hybrid encryption is causing a high group metadata expansion, it has a direct impact on the memory space that is used inside the SGX enclave. Accessing memory in SGX enclaves can induce an overhead of up to 19.5 % for write accesses and up to 102 % for read accesses [weisse2017regaining]. Apprehensive about the hypothesized SGX degradation in performance caused by the group metadata expansion, we shift the focus on finding a solution with minimal expansion.
III-C Broadcast Encryption (BE) and Identity-Based BE
In order to optimize both SGX and cloud transit costs, we investigate the possibility of cryptographic schemes that induce a minimal group expansion.
Broadcast Encryption (BE) [fiat1993broadcast] is a public-key cryptosystem with a unique public key that envelopes the entire system, contrary to the HE scheme where each user uses a different public key. However, each user in a BE system has a unique private key generated by a trusted authority. To randomly generate a group key gk and the associated group metadata (named encrypt operation within BE systems), one makes use of the system-wide public key. On the other side, when a user wants to reveal gk (decrypt in BE systems), she makes use of her individual private key.
As broadcast encryption schemes come with different contextual models, we impose a number of conditions. First, to maintain the same threat model as HE, we are only investigating the use of fully collusion-resistant BE schemes [boneh2005collusion], in which no coalition of members outside of the group could reveal gk. Second, the set of users participating in the system is not initially known, thus we rely on the usage of dynamic BE schemes [delerablee2007fully]. Third, as in the case of HE, we would prefer constructions that can accommodate the use of IBE.
Surveying the existing research literature, we identified an IBBE scheme [Delerablee:2007:IBBE] that not only fulfills all the aforementioned requirements, but also operates with group metadata expansions and user private keys of constant sizes. Moreover, the scheme has an additional strategic advantage that proves beneficial in our context: the system-wide public key size is linear in the maximal size of any group.
Upon analyzing the computational complexity of the selected IBBE scheme, one can notice that creating gk given a set of members, as well as decrypting it as a user, are operations with a quadratic complexity in the number of members. Therefore, even though the scheme brings a tremendous gain in the size of group metadata expansion, the computational cost of IBBE might be excessive for practical use.
Figure 2 exemplifies the performance of HE-PKI, HE-IBE and IBBE schemes in their raw form, before any integration with SGX is considered. The sub-figure on the left displays the total time taken for the operation of creating a group while the one on the right shows the size occupied by the expansion of group metadata. The optimality of IBBE regarding the size of group metadata expansion is immediately obvious. It always produces 256 bytes of metadata, regardless of the number of users per group. That is preferable compared to HE-PKI and HE-IBE, which produce increasingly larger values, as much as 27 MB for groups of 100,000 users, and 274 MB for the largest benchmarked group size. On the other hand, IBBE performs much worse than HE-PKI when considering the execution time. It is 150 and 144 times slower for groups of 10,000 and 100,000 users, respectively.
There is no doubt that running the IBBE scheme in this form is inadequate. In the remainder of this paper, we describe two original contributions, one that changes a traditional assumption of the IBBE scheme, and a second that lowers the user decryption time.
IBBE-SGX can be broadly described in 3 steps: (i) trust establishment and private key provisioning; (ii) membership definitions and group key provisioning; and (iii) membership changes and key updates.
IV-A Trust Establishment
IBBE schemes generate a single public key that can be paired with several private keys, one per user. Users, in turn, need to be sure that the private key they receive is indeed generated by someone they trust, otherwise they would be vulnerable to malicious entities trying to impersonate the key issuer. To achieve that, we rely upon a PKI to provide verifiable private keys to users.
Another security requirement of IBBE-SGX is that the key management must be kept in a TEE. Therefore, there must be a way of checking whether that is the case. On that front, Intel SGX makes it possible to attest enclaves. Running this procedure gives the assurance that a given piece of binary code is truly the one running within an enclave, on a genuine Intel SGX-capable processor (Section III-A).
Figure 3 illustrates the initial setup of trust that must be executed at least once before any key leaves the enclave. Initially, the enclaved code generates a pair of asymmetric keys. While the private one never leaves the trusted domain, the public key is sent along with the enclave measurement to the Auditor (1), who is responsible both for attesting the enclave and for signing its certificate, thus also acting as a Certificate Authority (CA). Next, the Auditor checks with IAS (2) whether the enclave is genuine. If so, it compares the enclave measurement with the expected one, so that it can be sure that the code inside the shielded execution context is trustworthy. Once that is achieved, the CA issues the enclave’s certificate (3), which also contains its public key. Finally, users are able to receive their private keys and the enclave’s certificate (4). The key will be encrypted by the enclave’s private key generated in the beginning. To be sure they are not communicating with rogue key issuers, users check the signature in the certificate and then use the enclave’s public key contained within. All communication channels described in this scheme must be protected by cryptographic protocols such as Transport Layer Security (TLS).
IV-B From IBBE to IBBE-SGX
Traditionally, the IBBE scheme [sakai2007identity, Delerablee:2007:IBBE] consists of the following four operations.
IV-B1 System Setup
The system setup operation is run once by a Trusted Authority (TA) and generates a Master Secret Key MSK and a system-wide Public Key PK.
IV-B2 Extract User Secret
The TA then uses the Master Secret Key MSK to extract the secret key of each user.
IV-B3 Encrypt Broadcast Key
The broadcaster generates a randomized Broadcast Key for a given set of receivers S, by making use of PK. Together with the Broadcast Key, the operation outputs a public broadcast ciphertext. The broadcast ciphertext can be publicly sent to the members of S so they can derive the Broadcast Key.
IV-B4 Decrypt Broadcast Key
Any member of S can discover the Broadcast Key by performing the decrypt broadcast key operation given her secret key and the broadcast ciphertext.
In contrast to traditional IBBE, which requires the use of a TA to perform the System Setup and Extract User Secret operations, we rely on SGX enclaves. Therefore, the master secret key MSK used by the two aforementioned operations can be made available in plaintext exclusively inside the enclave, and securely sealed if stored outside of the enclave for persistence reasons.
Similarly to IBBE, the Encrypt Broadcast Key and Decrypt Broadcast Key operations rely on the system public key PK, and are thus usable by any user of the system.
As opposed to the traditional IBBE usage scenario, our model requires that all group membership changes—generating the group key and metadata—are performed by an administrator. Administrators can use the master secret key MSK to encrypt, set up the system and extract user keys. The decryption operation, however, remains identical to the traditional IBBE approach, being executed by any arbitrary user. In the remainder of this paper, we refer to our new IBBE scheme as IBBE-SGX.
We now describe the computational simplification opportunities introduced by IBBE-SGX compared to IBBE [Delerablee:2007:IBBE]. First, by making use of MSK inside the enclave, the complexity of the encryption operation drops from O(|S|²) for IBBE to O(|S|) for IBBE-SGX, where |S| is the number of users in the broadcast group set S. The reason behind the complexity drop is bypassing a polynomial expansion of quadratic cost, necessary in the traditional IBBE assumptions. The reader is directed to Section A-C for the concrete mathematical inference process. We argue that this complexity cut is sufficient to tackle the impracticality of the IBBE scheme emphasized earlier in Figure 2. Second, by relying on MSK, one can build efficient access control specific operations, such as adding or removing a user from a broadcast group. IBBE-SGX can accommodate O(1) complexities for both operations, as illustrated in Sections A-E and A-F.
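The effect of knowing MSK can be illustrated with a toy computation. In the scheme of [Delerablee:2007:IBBE], the ciphertext contains a group element whose exponent is the product of (γ + H(u)) over all receivers u, where γ is the master secret. The sketch below is an illustration under explicit assumptions rather than the construction of Section A-C: it replaces the pairing group by integers modulo a Mersenne prime, omits the random blinding exponent, and uses hypothetical function names. Without γ, the element has to be assembled from the public powers h^(γ^j) after expanding the polynomial, which takes a quadratic number of coefficient updates; with γ, a linear number of multiplications suffices.

```python
import hashlib

P = 2**127 - 1   # Mersenne prime; toy stand-in for the pairing group (assumption)
ORDER = P - 1    # exponents may be reduced modulo P - 1 (Fermat's little theorem)
H_BASE = 3       # toy base element h (assumption)


def hash_identity(identity: str) -> int:
    """Map an identity string to an exponent, playing the role of H(u)."""
    digest = hashlib.sha256(identity.encode()).digest()
    return int.from_bytes(digest, "big") % ORDER


def public_powers(gamma: int, max_users: int) -> list:
    """Public key material h^(gamma^j) for j = 0..max_users."""
    return [pow(H_BASE, pow(gamma, j, ORDER), P) for j in range(max_users + 1)]


def expand_polynomial(hashes: list) -> list:
    """Coefficients of prod(x + h_i), lowest degree first: O(n^2) updates."""
    coeffs = [1]
    for h_i in hashes:
        nxt = [0] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            nxt[j] = (nxt[j] + c * h_i) % ORDER      # constant-term contribution
            nxt[j + 1] = (nxt[j + 1] + c) % ORDER    # shift by x
        coeffs = nxt
    return coeffs


def term_without_msk(powers: list, identities: list) -> int:
    """Traditional path: expand the polynomial, then combine public powers."""
    coeffs = expand_polynomial([hash_identity(u) for u in identities])
    result = 1
    for c, h_pow in zip(coeffs, powers):
        result = result * pow(h_pow, c, P) % P
    return result


def term_with_msk(gamma: int, identities: list) -> int:
    """Enclave path: multiply (gamma + H(u)) directly, O(n) multiplications."""
    exponent = 1
    for u in identities:
        exponent = exponent * (gamma + hash_identity(u)) % ORDER
    return pow(H_BASE, exponent, P)


members = ["alice", "bob", "carol", "dave"]
secret_gamma = 123456789
print(term_without_msk(public_powers(secret_gamma, 8), members)
      == term_with_msk(secret_gamma, members))
```

Both paths yield the same group element; only the second requires the secret, which is why it is confined to the enclave.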
Unfortunately, IBBE-SGX maintains an O(|S|²) complexity for the user decrypt operation, where |S| is the number of users in the broadcast set, because, similarly to IBBE encryption, the algorithm performs a polynomial expansion of quadratic cost. We address this drawback by introducing a partitioning mechanism as described later in Section IV-C.
Finally, we consider a re-keying operation, for optimally generating a new broadcast key and metadata when the identities of users in the group do not change. The operation can be performed in O(1) complexity for both IBBE and IBBE-SGX, as detailed in Section A-G.
IV-C Partitioning Mechanism for IBBE-SGX
Although IBBE-SGX produces a minimal metadata expansion and offers an optimal cost for group membership operations, it suffers from a prohibitive cost when a member needs to decrypt the broadcast key. To address this issue, we introduce a partitioning mechanism.
As the decryption time is bound to the number of users in the receiving set, we split the group into partitions (sub-groups) and therefore limit the user decryption time to the number of members in a single partition. Moreover, each partition broadcast key will wrap the prime group key gk, so that members of different partitions can communicate by making use of gk.
The partition mechanism is depicted in Figure 4. The first step consists in splitting the group of users into fixed-size partitions. The administrator can then use the encrypt functionality of IBBE-SGX to generate a sub-group broadcast key and ciphertext for each partition. Next, for each partition, the group key gk is encrypted using symmetric encryption such as AES, by using the partition broadcast key as the symmetric encryption key. Note that since the scheme is executing inside an SGX enclave, a curious administrator cannot observe gk nor the broadcast keys.
The group metadata of IBBE-SGX is therefore represented by the set of all pairs composed of the partition ciphertext and the encrypted group key, as depicted in Figure 4. The inquisitive cloud storage can then publicly receive and store this set of group metadata.
Whenever a membership change happens, the administrator will update the list of group members and send the affected partition metadata to the cloud. The clients, in turn, can detect a change in their group by listening to updates in their partition metadata.
The partitioning mechanism has an impact on the computational complexity of the IBBE-SGX scheme on the administrator side. First, as the public key of the IBBE system is linear in the maximal number of users in a group [Delerablee:2007:IBBE], it follows that the public key for the IBBE-SGX scheme is linear in the maximal number of users in a partition (denoted by |p|). Therefore, both the computational complexity and storage footprint of the system setup phase can be reduced by a factor representing the maximal number of partitions, without losing any security guarantee. Second, the complexities of IBBE-SGX operations change to accommodate the partitioning mechanism, as shown in Table I. Creating a group becomes the cost of creating as many IBBE-SGX partitions as the fixed partition size dictates. Adding a user to a group remains constant, as the new user can be added either to an existing partition or to a brand new one. Removing a user implies performing a constant time re-keying for each partition. Finally, the decryption operation gains by being quadratic in the number of users of the partition rather than the whole group.
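Table I: Complexities of IBBE-SGX operations with partitioning, where |P| is the number of partitions in a group and |p| is the fixed partition size.
| Operation | Complexity |
| Extract User Key | O(1) |
| Create Group Key | O(|P| × |p|) |
| Add User to Group | O(1) |
| Remove User from Group | O(|P|) |
| Decrypt Group Key | O(|p|²) |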
The partitioning mechanism also has an impact on the storage footprint for group metadata. Compared to IBBE when considering a single partition, the footprint is augmented by the group key symmetrically encrypted under the partition broadcast key and by the nonce required for this symmetric encryption. When considering an entire group, the cost of storing the group metadata is represented by the cost of a single partition multiplied by the number of partitions in the group, in addition to a metadata structure that keeps the mapping between users and partitions.
Although the partition mechanism induces a slight overhead, three factors keep it low. First, the number of partitions in a group is relatively small compared to the group size. Second, partition metadata are only manipulated by administrators, so they can locally cache it and thus bypass the cost of accessing the cloud for metadata structures. Third, as our model accepts that the identities of group members can be discovered by the curious administrator or the cloud, there is no cryptographic operation needed to protect the mappings within the partition metadata structure.
Determining the optimal value for the partition size mainly depends on the dynamics of the group. Indeed, there is a trade-off between the number and frequency of operations performed by the administrator for group membership and those performed by regular users for decrypting the broadcast key. A small partition size reduces the decryption time on the user side while a larger partition size reduces the number of operations performed by the administrator to run IBBE-SGX and to maintain the metadata.
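This trade-off can be explored with a back-of-the-envelope model. The sketch below is a hypothetical helper that only counts abstract work units derived from the complexities stated in this section (one constant-time re-key per partition on removal, decryption quadratic in the partition size); it does not predict wall-clock times.

```python
import math


def partition_costs(group_size: int, partition_size: int) -> dict:
    """Abstract cost units for one group under a given partition size."""
    partitions = math.ceil(group_size / partition_size)
    return {
        "partitions": partitions,
        # Removing a user re-keys every partition once (constant time each).
        "admin_rekeys_per_removal": partitions,
        # User decryption is quadratic in the partition size.
        "user_decrypt_units": partition_size ** 2,
    }


for size in (500, 1000, 2000, 4000):
    print(size, partition_costs(1_000_000, size))
```

Doubling the partition size halves the administrator's per-removal work but quadruples the user's decryption work in this model, so the best choice depends on how often members are revoked compared to how often users must re-derive the key.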
V IBBE-SGX Group Access Control System
We describe in this section the design and implementation of an end-to-end group access control system based on IBBE-SGX. The overall architecture is illustrated in Figure 5 and consists of a client and an administrator using Dropbox as a public cloud storage provider.
V-A System Design
The administrator’s Application Programming Interface (API) makes calls to the underlying SGX enclave that holds the functionalities of IBBE-SGX, which is built on top of an IBBE component. Since SGX is not required on the client side, the Client API directly calls the functionalities of the IBBE component. Both administrators and clients make use of local in-memory caches in order to save round-trips to the cloud for accessing existing access policies. Administrators make use of the PUT HTTP verb to send data to the cloud, while clients listen by using HTTP long polling. In Dropbox, long polling works at the directory level, so we index the group metadata as a bi-level hierarchy. The parent folder represents the group, and each child stands for a partition.
The operation for creating a group is described in Algorithm 1. Once the fixed-size partitions are determined (line 1), the execution enters the SGX enclave (lines 2 to 6) during which the random group key is enveloped by the hash of each partition broadcast key. The ciphertext values, as well as the sealed group key, leave the enclave to be later pushed to the cache and the cloud (line 7).
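The group creation flow can be sketched as follows. This is not Algorithm 1 itself: the IBBE-SGX encryption is replaced by a random placeholder key, the partition ciphertext is omitted, and AES is replaced by a toy XOR pad that is only suitable for illustration. All names are hypothetical.

```python
import hashlib
import secrets


def split_into_partitions(members: list, partition_size: int) -> list:
    """Cut the member list into consecutive fixed-size partitions."""
    return [members[i:i + partition_size]
            for i in range(0, len(members), partition_size)]


def toy_envelope(key32: bytes, nonce: bytes, data32: bytes) -> bytes:
    """Stand-in for AES (assumption): XOR with SHA-256(key || nonce).

    Handles 32-byte inputs only; applying it twice restores the input.
    """
    pad = hashlib.sha256(key32 + nonce).digest()
    return bytes(a ^ b for a, b in zip(data32, pad))


def create_group(members: list, partition_size: int):
    partitions = split_into_partitions(members, partition_size)
    # Everything below would run inside the enclave in the real system.
    group_key = secrets.token_bytes(32)
    metadata, broadcast_keys = [], []
    for part in partitions:
        bk = secrets.token_bytes(32)            # placeholder for the IBBE-SGX broadcast key
        nonce = secrets.token_bytes(12)
        wrap_key = hashlib.sha256(bk).digest()  # gk is enveloped by the hash of bk
        metadata.append({
            "members": list(part),
            "nonce": nonce,
            "wrapped_gk": toy_envelope(wrap_key, nonce, group_key),
        })
        broadcast_keys.append(bk)
    # In the real system only sealed or encrypted values leave the enclave;
    # the raw keys are returned here solely so the sketch can be checked.
    return group_key, broadcast_keys, metadata


gk, bks, md = create_group([f"user{i}" for i in range(10)], 4)
print([len(entry["members"]) for entry in md])
```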
The operation of adding a user to a group (Algorithm 2) starts by finding the set of all partitions with remaining capacity (line 1). If no such partition is found, a new partition is created for the user (line 3) and the group key is enveloped by the broadcast key of the new partition (lines 4 to 6), before persisting its ciphertexts (line 7). Otherwise, a partition that is not full is randomly picked, and the user is added to it (lines 9, 10). Since the partition broadcast key remains unchanged, only the ciphertext needs to be adapted to include the new user (line 10). The partition members and ciphertext are then updated on the cloud (line 12). Note that there is no need to push the encrypted group key as it was not changed.
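The partition-selection step of this operation can be sketched on plain membership lists. The sketch deliberately leaves out all cryptography (ciphertext update, enveloping of the group key for a new partition); the function name and return convention are hypothetical.

```python
import random


def add_user(partitions: list, user: str, partition_size: int, rng=random):
    """Place a user in a partition; returns (partition_index, created_new).

    partitions is a list of member lists and is modified in place.
    """
    with_capacity = [i for i, members in enumerate(partitions)
                     if len(members) < partition_size]
    if not with_capacity:
        # No room anywhere: open a new partition for the user.
        partitions.append([user])
        return len(partitions) - 1, True
    # Otherwise pick one of the partitions that still has room at random.
    index = rng.choice(with_capacity)
    partitions[index].append(user)
    return index, False


group = [["alice", "bob"], ["carol"]]
print(add_user(group, "dave", partition_size=2), group)
print(add_user(group, "erin", partition_size=2), group)
```

Only the chosen partition changes in either branch, which is what keeps the operation constant-time with respect to the group size.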
Removing a user from a group (Algorithm 3) proceeds by removing the user from her hosting partition (lines 1 and 2). Next, a new group key is randomly generated (line 3). The former user hosting partition broadcast key and ciphertext are changed to reflect the user removal (line 4) and then used for enveloping the new group key (line 5). For all the remaining partitions, a constant time re-keying regenerates the partition broadcast key and ciphertext that envelopes the new group key (lines 6 to 9). After sealing the new group key (line 10), the changes of metadata for the group partitions are pushed to the cloud (line 11). Note that the partition members only need to be updated for the removed user hosting partition.
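The bookkeeping behind a revocation can be sketched as a plan that lists which partitions need which kind of work. No cryptographic operation is performed; the dictionary keys and the function name are hypothetical.

```python
def remove_user(partitions: list, user: str) -> dict:
    """Remove a user in place and describe the work the revocation triggers."""
    host = next((i for i, members in enumerate(partitions) if user in members), None)
    if host is None:
        raise ValueError(f"{user!r} is not a group member")
    partitions[host].remove(user)
    others = [i for i in range(len(partitions)) if i != host]
    return {
        "new_group_key": True,           # a fresh gk is always generated
        "hosting_partition": host,       # broadcast key and ciphertext adapted for the removal
        "rekey_only": others,            # constant-time re-key, members unchanged
        "member_list_updates": [host],   # only this member list is pushed again
    }


group = [["alice", "bob"], ["carol", "dave"], ["erin"]]
print(remove_user(group, "carol"))
```

The plan touches every partition once, matching the linear dependence on the number of partitions, while only the hosting partition's member list changes.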
As many removal operations can result in partially unoccupied partitions, we propose the use of a re-partitioning scheme whenever the partition occupancies are too low. We implement a heuristic to detect a low occupancy factor such that if less than half of the partitions are only two thirds full, then re-partitioning is triggered. Re-partitioning consists in simply re-creating the group following Algorithm1.
Finally, the client decrypt operation works by first using IBBE to decrypt the broadcast key and then using the hash of this key as the AES key to decrypt the group key. Due to space constraints, we omit the formal algorithm specification.
To implement the system, we used the PBC [lynn2006pbc] pairing-based cryptography library which, in turn, depends on GMP [granlund1991gmp] to perform arbitrary-precision arithmetic. Both have to be used inside SGX enclaves (Section IV). There are several challenges when porting legacy code to run inside enclaves. Besides imposing severe memory limitations (Section III-A), SGX considers code running in any protection ring other than user mode (ring 3) as untrusted. Therefore, enclaves cannot call operating system routines.
Although memory limitations can have performance implications at runtime, they have little influence on enclave code porting. Calls to the operating system, on the other hand, can render this task very complex or even infeasible. Luckily, since both PBC and GMP mostly perform computations rather than input and output operations, the challenges in adapting them were chiefly limited to tracking and adapting calls to glibc. The necessary adaptations were made either by relaying operations to the operating system through outside calls (ocalls), or by performing them with enclaved equivalents. The outside calls, however, do not perform any sensitive action that could compromise security. Aside from source code modifications, we dedicated effort to adapting the compilation toolchain. This is necessary because one has to use curated versions of standard libraries (like the ones provided by the Intel SGX SDK), prevent the use of the compiler’s built-in functions, and set some other code generation flags. In total, the modified source code and compilation toolchain files amount to 32 lines for PBC and 299 for GMP.
Apart from changes imposed by SGX, we also needed to use common cryptographic libraries. Although some functions are provided in v.1.9 of the Intel SGX SDK [sgx-sdk], its AES implementation is limited to 128-bit keys. Since we aim at the maximal security level, we used the 256-bit AES implementation provided in Intel’s port of OpenSSL [intel-sgx-ssl]. The end-to-end system encapsulating both IBBE-SGX and HE schemes consists of 3,152 lines of C/C++ code and 170 lines of Python.
In this section, we benchmark the performance of the IBBE-SGX scheme from three different perspectives: by measuring the operations performance in isolation, then by comparing them to Hybrid Encryption (HE), and finally by capturing the performance when replaying realistic and generated access control traces. We chose to compare IBBE-SGX to HE only as the latter already shows better computational complexity than IBBE (see Figure (a)a).
The experiments were performed on a quad-core Intel i7-6600U machine, having a processor at 3.4 GHz with 16 GB of RAM, using Ubuntu 16.04 LTS.
Within the microbenchmarks we isolate the performance of each IBBE-SGX operation, and perform a comparison with the HE scheme.
First, we evaluate the performance of the bootstrap phase, which consists of setting up the system and generating secret user keys (Figure 6). The setup phase latency increases linearly with the partition size, growing by 1.2 s per 1,000 users. In contrast, extracting secret user keys achieves an average throughput of 764 operations per second, independent of the partition size.
Next, we evaluate the behavior of IBBE-SGX operations compared to HE. Figure (a)a displays the computational cost of creating a group and of removing a user from a group, as well as the storage footprint of the group metadata. One can notice that IBBE-SGX outperforms HE on all three metrics by approximately a constant factor. The create and remove operations of IBBE-SGX are on average 1.2 orders of magnitude faster than HE. Compared to the original IBBE scheme, IBBE-SGX is better by 2.4 orders of magnitude for groups of 1,000 users and 3.9 orders of magnitude for one million users (see Figures (a)a and (a)a). Storage-wise, IBBE-SGX is up to 6 orders of magnitude better than HE. Moreover, Figure (b)b zooms into the performance of the IBBE-SGX create and remove operations, and into the storage footprint, when considering different partition sizes. One can notice that the remove operation takes half the time of the create operation. Considering the storage footprint, the degradation brought by using smaller partition sizes is fairly small (e.g., 432 vs. 128 bytes for groups of 1 million members).
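To make these logarithmic figures concrete, recall that an improvement of $x$ orders of magnitude corresponds to a multiplicative factor of $10^{x}$. Converting the values reported in this paragraph (rounded) gives:

```latex
\begin{align*}
10^{1.2} &\approx 15.8  && \text{(create and remove, IBBE-SGX vs. HE)} \\
10^{2.4} &\approx 251   && \text{(IBBE-SGX vs. IBBE, 1{,}000 users)} \\
10^{3.9} &\approx 7.9 \times 10^{3} && \text{(IBBE-SGX vs. IBBE, one million users)} \\
10^{6}   &= 1{,}000{,}000 && \text{(storage footprint, upper bound vs. HE)}
\end{align*}
```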
The Cumulative Distribution Function (CDF) of latencies for adding a user to a group is shown in Figure (a)a. The operation has a constant time complexity for both IBBE-SGX and HE. As the add operation of IBBE-SGX can take two paths, either adding a user to an existing partition or creating a new one if all the others are full, the plot shows the difference between the two at the CDF value of 0.8. Moreover, the HE add operation is generally twice as fast as that of IBBE-SGX.
The client decryption performance is shown in Figure (b)b. The decryption operation, like the add operation, is faster with the HE approach than with IBBE-SGX. The difference of 2 orders of magnitude is caused by the quadratic cost of the IBBE-SGX decryption operation. We argue that a slower decryption time for IBBE-SGX can be acceptable in practice. First, the decrypt performance is overshadowed by the slow cloud response time necessary for clients to update the group metadata, which always precedes a decryption operation. Second, the cost of decryption remains bounded by the partition size, independent of the number of users in the group.
VI-B1 Real Data Set
To capture the performance of the IBBE-SGX scheme within a realistic scenario, we replay an access control trace based on the membership changes in the version control repository of the Linux Kernel [_linux_dataset].
We derive the membership trace by considering the first commit of a user as the add to group operation. The remove from group operation is represented by the user’s last commit. The generated trace contains 43,468 membership operations that span a period of 10 years, during which the group size never exceeds 2,803 users. We replay the generated trace sequentially for both HE and IBBE-SGX, varying the partition size for the latter. We also capture the total time spent by the administrator to replay the trace and the average user decryption time.
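The derivation rule (first commit as add, last commit as remove) can be sketched as follows. The function name `membership_trace` and the `(author, date)` input format are assumptions made for illustration and do not reflect the actual tooling; in particular, the sketch applies the rule literally, so an author with a single commit yields an add and a remove on the same day.

```python
from datetime import date


def membership_trace(commits):
    """Turn (author, commit_date) pairs into a time-ordered add/remove trace."""
    first, last = {}, {}
    for author, day in commits:
        first[author] = min(day, first.get(author, day))
        last[author] = max(day, last.get(author, day))
    # The integer tag orders an add before a remove that falls on the same day.
    events = [(day, 0, "add", author) for author, day in first.items()]
    events += [(day, 1, "remove", author) for author, day in last.items()]
    events.sort()
    return [(day, op, author) for day, _, op, author in events]


example_commits = [
    ("ann", date(2010, 1, 5)),
    ("bob", date(2010, 2, 1)),
    ("ann", date(2012, 3, 9)),
]
for event in membership_trace(example_commits):
    print(event)
```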
The corresponding figure displays the results. Considering the administrator replay time, IBBE-SGX performs better when the partition size converges to the number of users in the group. Using a small partition size, e.g. 250, makes the replay almost twice as slow as with a partition of 1,000 users. Compared to HE replay time, IBBE-SGX is generally 1 order of magnitude faster. On the other hand, decryption time for IBBE-SGX grows quadratically with the partition size while in HE it remains constant. This highlights the trade-off in IBBE-SGX between the performance of membership changes and the user decryption time, which is governed by the partition size. A prior estimation of the maximal group size (2,803 in our case) suggests the choice of a small partition size for practical use (such as 750), so that satisfactory outcomes are obtained both in terms of administrator performance and user decryption time.
VI-B2 Synthetic Data Set
To observe the impact of different workloads of group membership access control, we generate a set of synthetic traces that capture incremental percentages of revocation rates. Concretely, we generate 11 different traces, each consisting of 10,000 membership operations. The composition of each trace is randomly generated according to its revocation rate. We replay the 11 traces through our system and measure the end-to-end time required by the administrator to perform all membership changes. We then repeat the process for different partition sizes.
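A possible way of generating such traces is sketched below. Several details are assumptions that the text does not specify: the 11 revocation rates are taken to be 0%, 10%, ..., 100%, the group is pre-populated with `initial_members` users so that removals never run out of candidates, and the names `generate_trace` and `traces` are hypothetical.

```python
import random


def generate_trace(n_ops, revocation_rate, initial_members, seed=0):
    """Return a list of ("add" | "remove", user_id) membership operations."""
    rng = random.Random(seed)
    members = list(range(initial_members))  # assumed pre-existing group
    next_id = initial_members
    trace = []
    for _ in range(n_ops):
        # Remove with probability `revocation_rate`, as long as someone is left.
        if members and rng.random() < revocation_rate:
            victim = members.pop(rng.randrange(len(members)))
            trace.append(("remove", victim))
        else:
            members.append(next_id)
            trace.append(("add", next_id))
            next_id += 1
    return trace


# Assumed grid of revocation rates: 0.0, 0.1, ..., 1.0 (11 traces in total).
rates = [i / 10 for i in range(11)]
traces = {rate: generate_trace(10_000, rate, initial_members=10_000)
          for rate in rates}
```

Fixing the seed keeps each trace reproducible, which matters when the same trace has to be replayed for several partition sizes.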
The results are shown in Figure 10. We observe a linear increase in the total time when incrementally increasing the revocation ratio up to 50% in workloads dominated by add operations. After this point, the total time stabilizes and finally decreases when the revocation ratio is more than 90%. This behavior is caused by the merging of sparse partitions, which happens more frequently with the increasing rate of revocations. Having fewer partitions, IBBE-SGX’s operations become faster, therefore decreasing the total time.
VII Related Work
We structure the presentation of related work along three axes. First, we detail research on cryptographic schemes used for access control. Then, we cover systems that cryptographically protect data held on untrusted storage. Last, we discuss related work regarding Intel SGX.
VII-A Cryptographic Schemes for Access Control
HE, making use of a PKI and IBE, has been employed for role-based access control and proved unsuitable in the cloud storage context [Garrison:2016:DACCloud].
ABE [sahai2005fuzzy] is a cryptographic construction that allows a fine-grained access control by matching attributes labeled to both users and content. Depending on the labeled location, one can distinguish between key-policy ABE [goyal2006attribute] and ciphertext-policy ABE [bethencourt2007ciphertext]. However, when employed for simple access control policies, such as our group sharing context, ABE has substantially greater costs than identity-based encryption [Garrison:2016:DACCloud].
Hierarchical Identity Based Encryption (HIBE) [boneh2005hierarchical] and Functional Encryption (FE) [boneh2011functional] are two cryptographic schemes offering functionalities for access control that, similarly to IBE and ABE, rely on pairing-based cryptography. HIBE is specifically designed to target hierarchical organizations where a notion of descendants exists. FE is a powerful construction that can arbitrarily encapsulate programs as access control, but is unsuitable for practical use [fischiron].
Proxy re-encryption [ateniese2006improved] is a cryptographic system in which the owner of some encrypted data can delegate the re-encryption of her data to a proxy, with the intent of sharing it with another user. For the re-encryption to take place, the data owner needs to generate and transmit to the proxy a re-encryption key. The scheme proves to be beneficial for the cloud environment, as the re-encryption and the storage of the data can happen on the same premises. A number of approaches have shown how proxy re-encryption can be combined with identity-based encryption [green2007identity], or with attribute-based encryption [green2011outsourcing, sahai2012dynamic]. Unlike proxy re-encryption, our construction does not require users to send transformational keys to the administrators. Therefore, even if the administrators were hosted on the cloud storage premises, they would not act as proxies.
The related research area of multicast communication security [stinson2005cryptography, canetti1999multicast] defines efficient schemes focusing exclusively on revocation aspects. Logical Key Hierarchy [wallner1999key] is a re-keying scheme in which communications for revocation operations are minimized to logarithmic sizes. Other schemes [fiat1993broadcast, naor2000efficient] exploit a secret sharing mechanism, considering that no coalition of revoked users larger than a threshold number is trying to decrypt the transmission.
VII-B Cryptographically Protected Untrusted Storages
The shared cloud-backed file system (SCFS) designed by Bessani et al. [bessani2014scfs] offers confidentiality guarantees to users by encrypting, on the client side, the data stored by the clouds. Even though the encryption keys are distributed among multiple cloud storages through secret sharing schemes, the access control is not cryptographically protected, but stored in and enforced by a trusted coordination service. We argue that this approach is not secure enough because it does not protect from curious administrators. The global access control structure can be compromised if an attacker gains access to this service.
CloudProof [popa2011enabling] is a secure cloud storage system offering guarantees such as confidentiality, integrity, freshness and write-serializability. To enforce access control, CloudProof makes use of broadcast encryption to protect the keys that are used for encrypting and signing the actual data. Unlike our construction, CloudProof does not offer the zero knowledge guarantee for membership operations. Moreover, CloudProof does not discuss how the authentic identity of the users in the broadcast set is established (e.g. during a group creation operation). Hypothetically, a PKI could be employed for this task, thus requiring a trusted entity in the system. However, regularly accessing the PKI would add a significant overhead. In order to mitigate these issues, our solution relies on the identity-based version of broadcast encryption.
REED [li2016rekeying] considers the problem of rekeying in the context of honest-but-curious deduplicated storages. To provide access control, REED relies on ABE. However, as noted by the authors in their evaluation, the performance overhead of the rekeying operation drastically increases to several seconds when varying the total number of users up to 500. Considering group sizes of thousands of users (we consider groups of up to one million), ABE becomes impractical for access control at large scale.
The Sieve [wang2016sieve] platform allows users to store their data encrypted in the cloud and then delegate, at their discretion, access to the data to consuming web services. Sieve makes use of attribute-based encryption for access control policies and key homomorphism for providing a zero knowledge guarantee against the storage provider. This access control construction has many similarities with ours; however, we differ in that we exploit the zero knowledge guarantee to lower the computational complexity of the access control scheme, IBBE in our case.
SGX has been extensively used for shielding applications and infrastructure platform services that, like ours, handle sensitive data. Iron [fischiron] is the closest to our proposal in the sense that it takes advantage of SGX to build a practical encryption scheme for a strategy that has been impractical thus far. Like us, they use an enclave that holds a master secret as root for later key derivations. They target, however, functional encryption. The enclave generates a key that is associated with a function, so that the computation can be performed without revealing the data to which it is applied. The results of applying such a function, though, are presented in clear. The authors show that this approach outperforms by orders of magnitude other cryptographic schemes that also offer functional encryption.
Other systems relate to ours with regard to the reduction of overhead for an otherwise costlier design. Hybster [Behl:2017:Hybrids], for instance, proposes a hybrid state-machine replication protocol. It is hybrid in the sense that it tolerates arbitrary faults while assuming that some nodes may crash. It relies on SGX features such as isolation, replay protection and trusted counters to achieve a parallelization scheme that makes it a viable solution for demanding applications, reaching higher numbers of operations per second in comparison to traditional approaches.
At the level of infrastructure services, SCBR [Pires:2016:SCBR] proposes a content-based routing solution where the filtering step is put inside enclaves, thus allowing the matching of publications against stored subscriptions in a safe manner. It is shown to be one order of magnitude faster than an approach with comparable security guarantees. The gain comes from the plaintext operations done inside the enclave against the counterpart that needs to perform computations over encrypted data.
We have introduced IBBE-SGX, a new cryptographic access control extension that is built upon IBBE and exploits Intel SGX to cut the computational complexity of IBBE. We propose a group partitioning mechanism such that the computational cost of a membership update is bounded by a fixed partition size rather than by the size of the whole group. We have implemented and evaluated our new access control extension in a set-up with a single administrator and multiple users. We have conducted both real and synthetic benchmarks, demonstrating that IBBE-SGX is efficient both in terms of computation and storage even when processing large and dynamic workloads of membership operations. Our construction performs membership changes 1.2 orders of magnitude faster than the traditional approach of HE, producing group metadata that are 6 orders of magnitude smaller than HE, while at the same time offering zero knowledge guarantees.
There are a number of interesting avenues of future work. The first is to dynamically adapt the partition sizes based on the ongoing workload. This would optimize the speed of administrator- and user-performed operations. A second challenge would be to adapt our construction to a distributed set of administrators that would perform membership changes concurrently on the same group or partition, by using lock-free techniques. Third, in a setup with multiple administrators, one can envision certifying blocks of membership operation logs through blockchain-like technologies.
The research leading to these results has received funding from the French Directorate General of Armaments (DGA) under contract RAPID-172906010. The work was also supported by the European Commission, Information and Communication Technologies, H2020-ICT-2015 under grant agreement number 690111 (SecureCloud project) and partially supported by the CHIST-ERA "DIONASYS" project.
This appendix details the mathematical implications of adapting IBBE to IBBE-SGX. In typical IBBE schemes, the TA only executes the operations of system setup and user secret key extraction. In IBBE-SGX, however, the administrator agent executes all membership operations inside an SGX enclave.
The IBBE scheme [Delerablee:2007:IBBE] conceptually relies on the idea of bilinear maps. Denoted as $e : \mathbb{G}_1 \times \mathbb{G}_2 \rightarrow \mathbb{G}_T$, a bilinear map is defined by using three cyclic groups of prime order $p$, imposing bilinearity and non-degeneracy. El Mrabet et al. [el2017guide] provide a thorough overview of the usage of bilinear maps within the cryptographic setting. Moreover, the IBBE scheme implies the public knowledge of a cryptographic hash function $\mathcal{H}$ that maps user identity strings to values in $\mathbb{Z}_p^*$.
A-A System Setup
The initial operation is identical for IBBE and IBBE-SGX. The algorithm receives $(\lambda, m)$ as input, where $\lambda$ represents the security strength level of the cryptosystem, and $m$ encapsulates the largest envisioned group size. The output consists of the Master Secret Key $MSK$ and the system Public Key $PK$. To build $MSK$, the algorithm randomly picks $g \in \mathbb{G}_1$ and $\gamma \in \mathbb{Z}_p^*$: $MSK = (g, \gamma)$. To construct the $PK$, the algorithm computes $w = g^{\gamma}$ and $v = e(g, h)$, where $h \in \mathbb{G}_2$ was randomly picked: $PK = (w, v, h, h^{\gamma}, \ldots, h^{\gamma^{m}})$. The computational complexity of the system setup algorithm is linear in $m$.
A-B User Key Extraction
The key extraction operation is identical for IBBE and IBBE-SGX. For a given user identity $u$, the operation makes use of $MSK = (g, \gamma)$ and computes $USK_u$: $USK_u = g^{\frac{1}{\gamma + \mathcal{H}(u)}}$.
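A-C Encrypt Broadcast Key
The algorithm for constructing a broadcast key differs depending on the usage assumption. While IBBE has to rely on $PK$, IBBE-SGX can make use of $MSK$. In both cases, the group broadcast key $b_k$ is randomly generated by choosing a random value $k \in \mathbb{Z}_p^*$ and computing:
$$b_k = v^{k}$$
A group broadcast ciphertext $(C_1, C_2)$ for the set of users $S$ is then constructed by:
$$C_1 = w^{-k}$$
$$C_2 = h^{k \cdot \prod_{u \in S} (\gamma + \mathcal{H}(u))}$$
For IBBE, $\gamma$ cannot be used directly for computing $C_2$. Instead, the computation is carried out with a polynomial expansion of the exponent that uses the public key elements $h^{\gamma^{i}}$:
$$C_2 = \Big( \prod_{i=0}^{n} \big( h^{\gamma^{i}} \big)^{E_{n-i}} \Big)^{k}, \qquad E_j = \sum_{T \subseteq S,\, |T| = j} \; \prod_{u \in T} \mathcal{H}(u), \qquad n = |S|$$
For IBBE, computing $C_2$ is bound by the computations of all $E_j$, and thus requires a quadratic number of operations, $O(n^2)$. In the case of IBBE-SGX, having access to $\gamma$ allows computing $C_2$ directly from its definition $h^{k \cdot \prod_{u \in S} (\gamma + \mathcal{H}(u))}$ (Formula 3). It thus requires a linear number of operations.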
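The difference between the two computation paths can be illustrated numerically. The sketch below is not a pairing-based implementation: it assumes the ciphertext component has the form $h^{k \prod_{u}(\gamma + \mathcal{H}(u))}$ as in Delerablée's construction, and it replaces the pairing group by a toy subgroup of prime order 1019 in the integers modulo 2039, which offers no security. All names (`expansion_coefficients`, `c2_from_public_key`, `c2_from_master_secret`) and parameter values are hypothetical.

```python
Q, P, H = 2039, 1019, 4  # toy group: H generates a subgroup of prime order P mod Q


def expansion_coefficients(hashes, order):
    """Coefficients of prod_j (x + a_j), lowest degree first: O(n^2) work."""
    coeffs = [1]
    for a in hashes:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] + c * a) % order      # contribution of c * a
            nxt[i + 1] = (nxt[i + 1] + c) % order  # contribution of c * x
        coeffs = nxt
    return coeffs


def c2_from_public_key(pk_powers, hashes, k):
    """Public-key path: combine h^(gamma^i) using the expanded coefficients."""
    acc = 1
    for h_gamma_i, c in zip(pk_powers, expansion_coefficients(hashes, P)):
        acc = acc * pow(h_gamma_i, c, Q) % Q
    return pow(acc, k, Q)


def c2_from_master_secret(gamma, hashes, k):
    """Master-secret path: one multiplication per user, O(n) work."""
    exponent = k % P
    for a in hashes:
        exponent = exponent * (gamma + a) % P
    return pow(H, exponent, Q)


GAMMA, K = 77, 5                 # illustrative secret and randomness
USER_HASHES = [11, 22, 33, 44]   # stand-ins for the hashed user identities
PK_POWERS = [pow(H, pow(GAMMA, i, P), Q) for i in range(len(USER_HASHES) + 1)]

assert c2_from_public_key(PK_POWERS, USER_HASHES, K) == \
    c2_from_master_secret(GAMMA, USER_HASHES, K)
```

Both functions return the same group element; the nested loop in `expansion_coefficients` is where the quadratic cost of the public-key path comes from, whereas the master-secret path is a single pass over the users.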
Moreover, we augment the ciphertext values with $C_3$, which will prove useful for the subsequent operations:
$$C_3 = h^{\prod_{u \in S} (\gamma + \mathcal{H}(u))}$$
Note that $C_3$ can be stored publicly as it can be computed entirely from $PK$.
A-D Decrypt Broadcast Key
The decrypt operation is executed identically for IBBE and IBBE-SGX, and relies on $PK$. A user $u$ can make use of her secret key $USK_u$ to compute $b_k$, given $(S, C_1, C_2)$. We chose to omit presenting the intricate formula as we maintain it in its original form, as shown in [Delerablee:2007:IBBE].
A-E Add User to Broadcast Key
As the joining user $u_{add}$ is allowed to decrypt group secrets prior to joining, there is no need for a re-key operation that changes the value of $b_k$. The only required change is therefore to incorporate $u_{add}$ into $S$, and into $(C_2, C_3)$.
For IBBE, including $u_{add}$ into all $E_j$ values requires a quadratic number of operations. For IBBE-SGX, by making use of $MSK$, one has access to $\gamma$, thus the new user is included in constant time: $C_2 \leftarrow C_2^{\,\gamma + \mathcal{H}(u_{add})}$, $C_3 \leftarrow C_3^{\,\gamma + \mathcal{H}(u_{add})}$.
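A-F Remove User from Broadcast Key
Within the traditional assumption, $C_2$ is computed similarly to the encrypt broadcast key operation, consuming a quadratic number of operations. Within IBBE-SGX, having access to $\gamma$ through the $MSK$ allows first changing $C_3$ and then $C_2$ in constant time:
$$C_3 \leftarrow C_3^{\frac{1}{\gamma + \mathcal{H}(u_{rem})}}, \qquad C_2 \leftarrow C_3^{\,k}$$
where $k \in \mathbb{Z}_p^*$ is a freshly picked random value that also determines the new $b_k = v^{k}$ and $C_1 = w^{-k}$.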
A-G Re-key Broadcast Key
Sometimes, it is necessary to change the value of $b_k$ without performing any group membership changes. This re-keying operation can be performed optimally in constant time under both usage model assumptions, by making use of $C_3$.
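The constant-time updates discussed in this appendix can be checked with a small numeric sketch. It assumes that the stored ciphertext component has the form $C_3 = h^{\prod_{u}(\gamma + \mathcal{H}(u))}$, as in Delerablée's construction, and that a re-key raises $C_3$ to a fresh random $k$; these update rules are one natural reading of the text rather than a verified transcription. The group is a toy subgroup of prime order 1019 modulo 2039 with no security value, and all function names are hypothetical. The pairing-dependent values $b_k$ and $C_1$ are left out.

```python
Q, P, H = 2039, 1019, 4  # toy group: H generates a subgroup of prime order P mod Q


def c3_from_scratch(hashes, gamma):
    """Reference computation: h^(prod (gamma + a)) with one step per user."""
    exponent = 1
    for a in hashes:
        exponent = exponent * (gamma + a) % P
    return pow(H, exponent, Q)


def add_member(c3, a_new, gamma):
    """Constant-time add: raise C3 to (gamma + H(u_add))."""
    return pow(c3, (gamma + a_new) % P, Q)


def remove_member(c3, a_old, gamma):
    """Constant-time remove: raise C3 to the inverse of (gamma + H(u_rem))."""
    inverse = pow((gamma + a_old) % P, -1, P)  # modular inverse, Python 3.8+
    return pow(c3, inverse, Q)


def rekey(c3, k):
    """Constant-time re-key: C2 = C3^k for a fresh random k."""
    return pow(c3, k, Q)


GAMMA = 77                       # illustrative master secret
c3 = c3_from_scratch([11, 22], GAMMA)
c3 = add_member(c3, 33, GAMMA)
assert c3 == c3_from_scratch([11, 22, 33], GAMMA)
c3 = remove_member(c3, 22, GAMMA)
assert c3 == c3_from_scratch([11, 33], GAMMA)
c2 = rekey(c3, 5)
```

Each update costs a single modular exponentiation regardless of the number of members, which is the property that the enclave-held $\gamma$ makes available.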