Towards Practical Encrypted Network Traffic Pattern Matching for Secure Middleboxes

01/07/2020
by   Shangqi Lai, et al.
CSIRO
Monash University

Network Function Virtualisation (NFV) advances the development of composable software middleboxes. Accordingly, cloud data centres become major NFV vendors for enterprise traffic processing. Due to the privacy concerns of redirecting traffic to the cloud, secure middlebox systems (e.g., BlindBox) draw much attention; they can process encrypted packets against encrypted rules directly. However, most existing systems that support pattern matching based network functions require tokenisation of packet payloads via sliding windows at the enterprise gateway. Such tokenisation introduces a considerable communication overhead, which can exceed 100× the packet size. To overcome this bottleneck, in this paper, we propose the first bandwidth-efficient encrypted pattern matching protocols for secure middleboxes. We start from a primitive called symmetric hidden vector encryption (SHVE), and propose a variant of it, aka SHVE+, to enable encrypted pattern matching with constant, moderate communication overhead. To speed up matching, we devise encrypted filters that further reduce the number of accesses to SHVE+. We formalise the security of our proposed protocols, implement a prototype, and conduct comprehensive evaluations over real-world rulesets and traffic dumps. The results show that our design can inspect a packet over 20k rules within 100 μs. Compared to prior work, it brings a saving of 94% in bandwidth consumption.

I Introduction

Large-scale adoption of Network Function Virtualisation (NFV) facilitates easy realisation, deployment, and management of advanced network functions (aka middleboxes) for enterprises. Under this paradigm, cloud data centres become major NFV vendors [16]. Traditionally dedicated and tightly coupled hardware/software is transformed into composable software middlebox modules, which can run on commodity cloud instances with unlimited scalability. Such a significant technology shift also raises crucial privacy concerns because the traffic of enterprises is redirected and exposed to cloud data centres [26, 22]. Even though HTTPS is widely adopted nowadays, commercial middlebox services intercept and decrypt the encrypted traffic in the middle to retain advanced network functions like deep packet inspection (DPI) [18, 29].

To address this privacy concern and promote secure adoption of NFV, privacy-preserving middlebox systems [19, 28, 15, 8, 17, 7, 23, 11] have received much attention; these middleboxes can process encrypted traffic against encrypted processing rules without decryption. As a result, both sensitive traffic payloads and proprietary middlebox rules are protected without sacrificing the underlying operations of network functions, such as pattern matching, header inspection, and regular expression matching. Existing studies in this field can be classified into two categories, i.e., software-based solutions [19, 28, 15, 8] and hardware-based solutions [17, 7, 23, 11]. Unfortunately, solutions in neither category are practically deployable due to efficiency or security issues.

Mainstream software-based solutions [19, 28, 15, 8] adapt a cryptographic technique named searchable encryption [5], which allows middleboxes to match the encrypted patterns extracted from the rules against the encrypted string streams (i.e., tokens) parsed from traffic payloads. However, those designs are communication-inefficient. The fundamental reason is that traffic payloads need to be tokenised into string streams via sliding windows of varied sizes (i.e., enumerating the sizes of all patterns). As shown in prior work [19, 28], the resulting traffic can be tens of times the original packet size. Consequently, token transmission introduces long latency, which is not acceptable in a wide range of networked applications. Besides, high I/O consumption between the enterprise and the cloud greatly increases the capital cost of data transfer.

Hardware-based solutions [17, 7, 23, 11] rely on a hardware enclave (i.e., Intel SGX) to execute middlebox functions in a trusted environment. Traffic is fed into the enclave and processed within it. Although using SGX brings benefits in efficiency and functionality for secure middleboxes, recent side-channel attacks against SGX [24, 25, 13] raise serious doubts about whether SGX is fit for practical deployment.

In order to tackle the above limitations, in this paper, we aim to propose practical cryptographic protocols for a wide range of pattern matching-based secure middleboxes. Our design expects to offer convincing performance in both time and communication towards network environments while ensuring strong protection for rules and traffic payloads.

As mentioned, existing designs based on searchable encryption fall short of achieving bandwidth efficiency. To overcome this bottleneck, we observe that a cryptographic primitive named hidden vector encryption (HVE) [12] is a suitable starting point for building a bandwidth-efficient protocol for encrypted network traffic pattern matching. Specifically, HVE generates the key and the ciphertext from two same-sized vectors, respectively; HVE decryption succeeds only if the two vectors agree at every non-wildcard position. In our context, the packet payload is encrypted from a vector of the payload byte stream, while the rule pattern is encrypted from a predicate vector with wildcard positions. The offset of the pattern in the predicate vector is specified by the inspection rule. Later, the middlebox can perform HVE decryption on the encrypted payload and pattern to check if there is a match. As a result, the communication overhead with HVE becomes constant because traffic is encrypted byte-wise. For efficiency, we exclude public-key HVE schemes and resort to a symmetric-key HVE scheme, aka SHVE [14].

The original SHVE scheme [14] is designed for encrypted membership testing only, where the message is not embedded in the SHVE ciphertext. Thus, it cannot be directly applied to pattern matching-based middlebox functions like DPI, because a DPI rule contains patterns and the corresponding action (e.g., alert, drop), and the entire rule should fully be protected without matching [1, 28]. To solve this problem, we propose a variant of SHVE called SHVE+ which supports both encrypted byte-wise matching and message encryption. Our new primitive preserves the same security guarantees of the existing secure middlebox systems [19, 28]. The equality of byte strings in a packet payload is fully hidden, and the action can be triggered only if a match is found in the encrypted payload at the specified positions in the DPI rules.

To improve efficiency, we design a fine-grained progress filtering protocol to reduce the number of accesses on SHVE+ during the matching process: if a packet is filtered out, i.e., being identified as a mismatch, the middlebox will not continue to process it. As a result, our middlebox saves processing time significantly as most of the packets are commonly legitimate [4, 28, 21]. To apply filtering to encrypted traffic, we propose an encrypted filter structure via SHVE. It is carefully designed in a way that the encrypted packet payload can be used for both filtering and pattern matching. Namely, introducing the filters does not incur extra bandwidth cost.

For completeness, we formalise the security of our proposed protocols. First, we formally capture the capabilities of adversaries considered in the targeted middlebox system. One adversarial model aims to infer sensitive information from encrypted packets. The other aims to deduce information from encrypted rulesets. We note that security analysis of existing designs [19, 8, 28, 9] falls short of capturing the above adversaries at the same time. To bridge the gap, we adapt the real/ideal paradigm to define two groups of games under the above two adversarial models. We prove that even if an adversary is capable of selecting the packet or ruleset to be challenged in advance, she only learns a controlled leakage profile regarding the packet and ruleset.

We implement a prototype and deploy it on a commodity machine. We use real-world patterns (Snort and ETOpen) and network traffic (iCTF08) to evaluate its performance. Regarding latency, our middlebox system can inspect a packet for ETOpen ruleset (20k+ rules) within s and Snort ruleset (1.5k rules) within s. In a multi-session scenario ( concurrent connections), the throughput per connection reaches packets per second for Snort ruleset and packets per second for ETOpen ruleset. The overall throughput is over GBps and MBps, respectively. Regarding bandwidth consumption, our design consumes the least bandwidth among all prior arts (including the one in [6] also with constant complexity): it only costs 5 times more bandwidth in terms of the original packet size, which saves more than comparing the designs [19, 15, 28]

using tokenisation. Our approach significantly reduces the cost of deploying pattern matching middleboxes in the cloud. A cost estimation based on AWS pricing information demonstrates that the monthly maintenance cost of our middlebox is only one-fourth of that of the tokenisation-based approaches.

Our contributions can be summarised as follows:

  • We design a new variant of the SHVE scheme called SHVE+, which preserves the functionality, efficiency, and security properties of SHVE while additionally supporting message encryption.

  • We propose the first bandwidth-efficient encrypted pattern matching protocol built from SHVE+, which enables middleboxes to perform pattern matching over encrypted traffic with constant, moderate bandwidth overhead.

  • We propose a secure filter to filter out legitimate packets, and it further improves the efficiency of middleboxes by to . Meanwhile, it does not incur extra bandwidth cost as it reuses the encrypted traffic for pattern matching.

  • We are the first to comprehensively formalise the security of encrypted pattern matching protocols for secure middleboxes. We formally prove that our protocol protects against the adversary who wants to compromise traffic and rules throughout pattern matching, respectively.

  • We implement a system prototype and evaluate it with real-world rulesets and a traffic dump. We evaluate the setup time, storage overhead, inspection delay, bandwidth overhead, throughput, and deployment cost of our system, and compare them with two prior encrypted pattern matching protocols (i.e., BlindBox [19] and SEST [6]).

II Related Work

Software-based secure middleboxes. Our work is related to software-based (aka cryptographic) solutions for secure middleboxes. BlindBox [19] is the first system that supports pattern matching based network functions over encrypted traffic payloads. It is also the first to use searchable encryption for encrypted pattern matching. Later, a line of work improved the design of BlindBox, including the realisation of header matching [15, 9], dedicated inspection rules [28], and regular expression support [8]. As mentioned, these designs built from searchable encryption require tokenisation of the payloads, which is a critical performance bottleneck of the system (it can inflate bandwidth to many times the original traffic size). Some other studies are built from advanced cryptographic tools. Splitbox [1] uses secure multi-party computation techniques for rule matching, which is also not communication-efficient.

Hardware-based secure middleboxes. There are also solutions based on trusted hardware, i.e., Intel SGX. These designs [17, 7, 23, 11] aim to achieve the same goal of processing encrypted traffic, yet using trusted hardware enclave. As mentioned, Intel SGX is currently vulnerable to side-channel attacks [24, 25, 13], which can break the security guarantee of the trusted enclave. Besides, deploying those systems requires the cloud servers to be equipped with SGX and enforces the enterprises to trust the hardware vendor. The above constraints would limit the adoption of these SGX-based systems.

Pattern matching on encrypted data. In the literature, some theoretical work also investigates pattern matching on encrypted data. A recent scheme [6] based on cryptographic pairing achieves a constant communication overhead to the packet size. Unfortunately, such a theoretical design still introduces unaffordable bandwidth overhead in practice, i.e., larger than the original traffic. Besides, the pairing based matching operation is too slow to be deployed in traffic processing. A detailed comparison can be found in Section VI.

Some earlier studies address substring matching [3, 10]. Those studies focus on a different application scenario, where a long string is encrypted and stored at the server, and a substring (pattern) query is later issued and matched against the long string.

Scheme                          Communication Cost   Storage Cost   Inspection Delay
SEST [6]                        High                 Medium         ms level
Splitbox [1]                    High                 Medium         ms level
Tokenisation [19, 28, 15, 8]    High                 Low            μs level
Our middlebox                   Low                  High           μs level

TABLE I: Summary of the performance of representative software-based secure pattern matching middleboxes.

To summarise, we present a comparison table (Table I). It shows that our proposed design outperforms the existing cryptographic works [1, 6] in terms of communication cost and inspection delay. It greatly reduces the communication cost of tokenisation-based approaches [19, 28, 15, 8] while preserving a microsecond-level inspection delay. Although its storage cost is higher than that of the other solutions, this is not an issue for in-cloud middleboxes. In networked applications, latency is crucial to user experience and quality of service. The latency of traffic processing is more sensitive to bandwidth, while rule encryption and upload is a one-time setup cost. Besides, bandwidth is much more expensive than storage in the modern cloud (see Section VI). Our solution therefore offers a significant maintenance cost saving in real-world deployment.

III Overview

III-A System Architecture

Our proposed design employs the same architecture as existing secure middleboxes [27, 28, 15, 26] (to list just a few); it redirects an enterprise’s traffic from the enterprise gateway to a third-party middlebox service for pattern matching-based packet processing. During this process, the enterprise leverages the middlebox to thoroughly inspect all traffic and enforce its security rules to defend against malicious activities. In addition, the enterprise aims to protect the ruleset in the outsourced environment, because it can be either a proprietary ruleset subscribed from professional vendors [19] or a customised open-source ruleset with private information [28], e.g., the enterprise’s trade secrets or intellectual property.

Fig. 1 presents the system architecture. (If an enterprise endpoint connects to an external network, the processed traffic from the middlebox is sent back to the gateway and then sent out [27, 15].) It has two parties: the gateway (GW) maintained by the enterprise and the middlebox (MB) deployed at the service provider, such as a public cloud. We also use the term “endpoint” to denote the server within the enterprise. The system flow involves three phases:

Initialisation. Before initiating any connection, GW randomly chooses a key and uses it to generate encrypted rules to be used by MB for detecting malicious packet payloads. In practice, each rule describes an attack via its representative patterns, which may include suspicious strings in the payload and the offset information for the string [20]. Each rule also indicates the corresponding actions (e.g., alert, drop) once a match is found. Thus, GW creates an encrypted list for the pattern-action list extracted from the ruleset, which is later used for MB to match those strings in the encrypted payload and perform the associated action. Meanwhile, GW builds an encrypted filter, which can quickly process the mismatches in traffic, and it accelerates the pattern matching process. The generated encrypted filter and pattern list are uploaded to MB. Later, MB can perform packet inspection for all incoming traffic through the encrypted filter and pattern list.

Preprocessing. GW should preprocess the packet payload before sending it for inspection. Specifically, GW scans the packet payload byte by byte and uses the key chosen during initialisation to generate encrypted traffic dedicated to the pattern matching service, such as DPI. Then, GW sends the encrypted traffic from the enterprise network to MB.

Inspection. Upon receiving the encrypted traffic, MB withholds the incoming traffic and executes the proposed encrypted pattern matching protocol to inspect the traffic with the pre-computed encrypted pattern list. If an action can be recovered after checking the encrypted patterns, MB applies the action to the packet; otherwise, the packet is considered legitimate, and MB sends it out to the external network. To improve the efficiency of this process, MB exploits a secure filtering protocol. Specifically, for each packet, MB uses the pre-built encrypted filter to quickly evaluate whether the current position in the encrypted traffic is a possible matching position. As a result, MB separates the innocuous input from the possibly malicious traffic, and it runs the pattern matching protocol only for those possible matching positions instead of checking the whole traffic against the encrypted patterns.

Fig. 1: System Architecture. The arrows indicate traffic from the sender network to the receiver network; the response traffic follows the reverse direction.

Remark. Following prior studies [19, 28] that support pattern matching over the encrypted traffic, the real network traffic is protected by SSL. That is, the sender gateway initialises a normal SSL connection with the receiver and sends the SSL traffic with encrypted traffic to MB for inspection. The receiver can use its SSL session key to recover the real network traffic.

III-B Threat Assumption

We assume that GW in the enterprise network is a trustworthy party. It follows the proposed protocol and does not disclose the ruleset to other parties. On the other hand, the MB service provider is assumed to be semi-honest. It also follows the protocol to offer pattern matching service but attempts to extract sensitive data from the encrypted traffic passing through the middlebox and infer the private ruleset owned by the enterprise. Also, the middlebox can be compromised or eavesdropped as it is deployed in an untrusted environment [19]. Therefore, the main goal of the proposed system is to hide both the content of traffic and the ruleset from MB while allowing MB to perform pattern matching over the encrypted traffic.

We also assume that at least one endpoint in the communication is honest. This is consistent with the threat model in existing privacy-preserving pattern matching middleboxes [19, 28, 8]. Note that detecting two malicious endpoints is an orthogonal work, and we do not consider this case in our paper.

III-C Building Blocks

Basic cryptographic tools. We leverage a pseudo-random function (PRF) $F$, which is a polynomial-time computable function family that is computationally indistinguishable from random functions to any probabilistic polynomial-time adversary. Besides, we make use of a symmetric key encryption scheme $\mathsf{Sym}$, which consists of three probabilistic polynomial-time algorithms $(\mathsf{Sym.Gen}, \mathsf{Sym.Enc}, \mathsf{Sym.Dec})$. $\mathsf{Sym.Gen}$ generates the secret key $k$. A message $\mu$ can be encrypted as a ciphertext $ct = \mathsf{Sym.Enc}(k, \mu)$ and decrypted by $\mu = \mathsf{Sym.Dec}(k, ct)$. The formal definitions of the PRF and the symmetric key encryption scheme can be found in [14].

SHVE. SHVE [14] is a predicate encryption scheme that supports conjunctive, equality, comparison and subset membership queries over encrypted data. Compared to the public-key HVE schemes [12], SHVE is much faster as it relies only on the PRF and symmetric key encryption. We present a brief definition of SHVE below.

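Let $\Sigma$ be an attribute set and $*$ be a wildcard symbol (“don’t care” value). We define $\Sigma_* = \Sigma \cup \{*\}$. Let $\mathbf{x} = (x_1, \ldots, x_m)$ with $x_l \in \Sigma$ be an attribute vector, and $\mathbf{v} = (v_1, \ldots, v_m)$ with $v_l \in \Sigma_*$ be a predicate vector. The predicate function $P(\mathbf{v}, \mathbf{x}) = 1$ if and only if for each $l \in [m]$, we have $v_l = x_l$ or $v_l = *$. In other words, the predicate function returns “1” only when the vector $\mathbf{x}$ matches $\mathbf{v}$ in all non-wildcard positions. The SHVE scheme uses a PRF $F$ and the symmetric key encryption scheme $\mathsf{Sym}$ as described above. It comprises four probabilistic polynomial-time algorithms:

  • $\mathsf{SHVE.Setup}(1^\lambda)$: On input the security parameter $\lambda$, the algorithm outputs the master secret key $msk$.

  • $\mathsf{SHVE.KeyGen}(msk, \mathbf{v})$: On input the master secret key $msk$ and a predicate vector $\mathbf{v}$, the algorithm outputs the query trapdoor $s = (d_0, d_1, S)$, where $d_0$ is a masked random key, $d_1$ is a symmetric ciphertext and $S$ keeps all non-wildcard positions in $\mathbf{v}$.

  • $\mathsf{SHVE.Enc}(msk, \mathbf{x})$: On input the master secret key $msk$ and an attribute vector $\mathbf{x}$, this algorithm sets $c_l = F(msk, x_l \| l)$ for each $l \in [m]$, and outputs the ciphertext $\mathbf{c} = (c_1, \ldots, c_m)$.

  • $\mathsf{SHVE.Query}(s, \mathbf{c})$: The query algorithm takes as input a trapdoor $s$ and a ciphertext $\mathbf{c}$. If the algorithm recovers “0” from $s$ and $\mathbf{c}$, the query algorithm outputs “True” (indicating $P(\mathbf{v}, \mathbf{x}) = 1$); else it outputs $\perp$.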
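To make the four algorithms concrete, the following is a minimal Python sketch of an SHVE-style scheme over byte vectors. It is an illustration under stated assumptions, not the construction of [14]: HMAC-SHA256 stands in for the PRF, a hash-derived one-time pad stands in for the symmetric cipher, and `None` plays the role of the wildcard. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

WILDCARD = None      # assumption: None stands in for the wildcard symbol
ZERO = bytes(32)     # fixed plaintext hidden in the trapdoor

def prf(key: bytes, value: int, pos: int) -> bytes:
    # Assumption: HMAC-SHA256 over (byte value || position) as the PRF.
    return hmac.new(key, bytes([value]) + pos.to_bytes(4, "big"), hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_cipher(key: bytes, data: bytes) -> bytes:
    # Toy stand-in for symmetric encryption (XOR with a key-derived pad).
    # Encryption and decryption are the same operation. Not for real use.
    return xor(data, hashlib.sha256(b"pad" + key).digest())

def setup() -> bytes:
    return secrets.token_bytes(32)

def keygen(msk: bytes, predicate):
    # Mask a fresh random key with the PRF outputs of all non-wildcard positions.
    positions = [i for i, v in enumerate(predicate) if v is not WILDCARD]
    k = secrets.token_bytes(32)
    d0 = k
    for i in positions:
        d0 = xor(d0, prf(msk, predicate[i], i))
    d1 = toy_cipher(k, ZERO)
    return d0, d1, positions

def encrypt(msk: bytes, attributes: bytes):
    # One PRF output per (byte, position) pair.
    return [prf(msk, x, i) for i, x in enumerate(attributes)]

def query(trapdoor, ciphertext) -> bool:
    d0, d1, positions = trapdoor
    if any(i >= len(ciphertext) for i in positions):
        return False
    k = d0
    for i in positions:
        k = xor(k, ciphertext[i])   # unmasks the key only if every position agrees
    return toy_cipher(k, d1) == ZERO
```

Because each ciphertext element depends on the position as well as the byte value, two equal bytes at different positions produce different ciphertext elements in this sketch.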

IV The Proposed System

IV-A Construction of SHVE+

SHVE [14] can be adapted to achieve efficient encrypted pattern matching for network traffic. However, it cannot be directly used for middlebox functions like DPI. As mentioned in Section III-A, the inspection rule consists of the inspection patterns and corresponding action. To fully protect the rules during the matching process, both of them should be encrypted. Also, to preserve the functionality, the action needs to be recovered for MB further processing when the pattern is matched. To this end, the action should be considered as a message encrypted with the pattern in SHVE. Similar to the design in prior work [28, 1], the above design encrypts both the rule and action to minimise the leakage in the outsourced middlebox. Nonetheless, we also take the performance into consideration and choose to reveal the action for those matched patterns. This trade-off enables our middlebox to efficiently and securely handle a large volume of packets at a moderate cost and well-defined leakage. We note that the original SHVE construction [14] can only be used for membership testing, whereas the message encryption is yet to be supported. To address this issue, this section presents a new SHVE scheme, dubbed SHVE+, which enables message encryption on SHVE.

Construction. The original SHVE (see Section III-C) leverages a random key $K$ to encrypt “0” in the SHVE trapdoor ($d_1$ in the trapdoor), and it uses the predicate vector $\mathbf{v}$ to mask the random key and keeps the masked key in $d_0$. If $\mathbf{v}$ matches the attribute vector $\mathbf{x}$ encrypted in the ciphertext at all non-wildcard positions, the encrypted “0” can be recovered from the trapdoor after running the query algorithm, and SHVE outputs “True”. Intuitively, we can exploit the term $d_1$ in the trapdoor to store another encrypted message instead. Then, the query algorithm is changed to return the decryption of $d_1$ after decrypting it successfully.

We now present the details of our SHVE+ construction. Only the modified algorithms are given here; the other algorithms remain the same as in Section III-C.

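  • $\mathsf{SHVE^{+}.KeyGen}(msk, \mathbf{v}, \mu)$: On input the master secret key $msk$, a predicate vector $\mathbf{v}$ and a message $\mu$, the algorithm extracts all non-wildcard positions from $\mathbf{v}$. Let these positions be $S = \{l_1, \ldots, l_{|S|}\}$; the algorithm samples a random key $K$ and computes $d_0 = \bigoplus_{j \in [|S|]} F(msk, v_{l_j} \| l_j) \oplus K$ and $d_1 = \mathsf{Sym.Enc}(K, \mu)$. Finally, it outputs the trapdoor $s = (d_0, d_1, S)$ corresponding to the predicate vector $\mathbf{v}$.

  • $\mathsf{SHVE^{+}.Query}(s, \mathbf{c})$: The query algorithm takes as input a trapdoor $s = (d_0, d_1, S)$ and a ciphertext $\mathbf{c}$. Then, it computes $K' = \bigoplus_{j \in [|S|]} c_{l_j} \oplus d_0$ and returns $\mu' = \mathsf{Sym.Dec}(K', d_1)$.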
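The modified key generation and query can be sketched in Python as follows. This is a toy illustration, not the authors' implementation: HMAC-SHA256 stands in for the PRF, and the message is protected with an assumed encrypt-then-MAC construction (a SHAKE-256 keystream plus an HMAC tag) so that a wrong key is detected and yields `None`. How the actual scheme recognises a valid decryption is not specified in this section, so that part in particular is an assumption.

```python
import hashlib
import hmac
import secrets

def prf(key: bytes, value: int, pos: int) -> bytes:
    # Assumption: HMAC-SHA256 over (byte value || position) as the PRF.
    return hmac.new(key, bytes([value]) + pos.to_bytes(4, "big"), hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def seal(key: bytes, msg: bytes) -> bytes:
    # Toy encrypt-then-MAC: keystream XOR, then an HMAC tag over the body.
    body = xor(msg, hashlib.shake_256(b"enc" + key).digest(len(msg)))
    return body + hmac.new(key, body, hashlib.sha256).digest()

def open_(key: bytes, blob: bytes):
    body, tag = blob[:-32], blob[-32:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None                       # wrong key: no message is released
    return xor(body, hashlib.shake_256(b"enc" + key).digest(len(body)))

def setup() -> bytes:
    return secrets.token_bytes(32)

def keygen_plus(msk: bytes, predicate, message: bytes):
    # The trapdoor now carries the encrypted message instead of an encrypted zero.
    positions = [i for i, v in enumerate(predicate) if v is not None]
    k = secrets.token_bytes(32)
    d0 = k
    for i in positions:
        d0 = xor(d0, prf(msk, predicate[i], i))
    return d0, seal(k, message), positions

def encrypt(msk: bytes, attributes: bytes):
    return [prf(msk, x, i) for i, x in enumerate(attributes)]

def query_plus(trapdoor, ciphertext):
    d0, d1, positions = trapdoor
    if any(i >= len(ciphertext) for i in positions):
        return None
    k = d0
    for i in positions:
        k = xor(k, ciphertext[i])
    return open_(k, d1)                   # the message on a match, otherwise None
```

In the middlebox setting, the message would be the rule action (for example `b"alert"`), released only when the payload agrees with the pattern at every non-wildcard position.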

Under the SHVE+ scheme, the proposed system can encrypt the pattern as an SHVE+ trapdoor and encrypt the traffic as an SHVE+ ciphertext to perform the inspection. Specifically, GW generates a pattern array initialised with the wildcard character ‘∗’ in all positions. Then it inserts each byte of the pattern string into the pattern array according to the rule (string content, start/end position), and uses the array as the predicate vector and the action as the message to compute the encrypted pattern via the SHVE+ key generation algorithm. Later, GW parses the traffic into a byte array and uses it as the attribute vector to obtain the encrypted traffic via the encryption algorithm. Finally, on MB, the encrypted pattern is evaluated against the traffic in the form of an SHVE+ ciphertext, and the action is properly recovered if a match is found according to the definition of SHVE [14].

Security. SHVE+ retains SHVE’s security properties for membership testing, which guarantees that the pattern matching process only reveals whether the encrypted traffic includes the pattern in the position specified by rules, but nothing more. Moreover, it ensures that the message can only be recovered when the traffic matches the pattern, which is consistent with the security requirement of the proposed middlebox service. A detailed analysis is given in Section V.

Fig. 2: An example of the encrypted pattern generation: each pattern string is inserted into multiple matching arrays to match the pattern in every possible position in the payload.
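
Algorithm 1 Encrypted Rule Generation
Input: the master secret key from the setup algorithm; the ruleset
Output: the encrypted pattern list
function Generate(master secret key, ruleset)
     Parse the ruleset as a pattern-action list
     for each pattern-action pair in the list do
          Read the range of possible starting positions of the pattern from the rule
          for each possible starting position do
               Initialise a pattern array with the wildcard character in all positions
               Insert the pattern string at the starting position
               Run the SHVE+ key generation on the pattern array with the action as the message
               Store the resulting encrypted pattern in the encrypted pattern list
     return the encrypted pattern list

Algorithm 2 Rule Matching
Input: GW inputs the master secret key and the payload; MB inputs the encrypted pattern list
function Match()
On GW:
     Parse the payload as a byte array and compute the encrypted traffic
     Send the encrypted traffic to MB
On MB:
     for each encrypted pattern in the encrypted pattern list do
          Run the SHVE+ query on the encrypted pattern and the encrypted traffic to recover an action
          Execute the action if it is valid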

IV-B The Proposed Encrypted Pattern Matching Protocol

In order to support secure pattern matching over encrypted traffic, existing work [19, 28] leverages an encrypted index built from the pattern-action list. More specifically, the encrypted index is keyed by the encryption of each pattern string. When a given inspection token matches an encrypted indexing term, MB can recover the action from the index and execute it. However, due to the complexity of matching patterns (various sizes, matching positions, etc.), this approach has to tokenise the original packet payload into a large number of tokens, which can blow up the bandwidth consumption, as reported in [19]. To enable pattern matching in a bandwidth-saving manner, our system is built from SHVE+ because it does not rely on any tokenisation algorithm. Instead, it encrypts the payload and queries the pattern byte-wise. Consequently, its bandwidth overhead is constant no matter how long the pattern is (see Section IV-A).

Pattern matching for arbitrary pattern strings. Algorithm 1 summarises the detailed encrypted rule generation procedure run by GW. As mentioned, each inspection rule is parsed as a pattern-action tuple, and our protocol generates the encrypted pattern list from it. In practice, the inspection rules often involve qualifiers that specify a range of positions to be checked in the packet payload. For example, the following Snort rule [20] specifies the “depth” (only search 8 bytes instead of 1500 bytes for the pattern) and “offset” (start to search the pattern from the 12th byte of the payload).

     alert udp HOME_NET 111 (flow:to_server; content:"|00 01 86 A0|", depth 8, offset 12)

Thus, our protocol takes the above two qualifiers into consideration when generating pattern arrays and encrypted patterns. As shown in Fig. 2, our protocol first generates pattern arrays filled with the wildcard character ‘∗’. Then, it inserts the pattern string into the pattern arrays at each possible starting position, which is 12 to 17 in our example. For each pattern array, our protocol inputs the action and runs the SHVE+ key generation algorithm, which encrypts the pattern string and its position after concatenating them together (see Section IV-A). This ensures that matches happen only at the positions specified by the rule. The resulting encrypted pattern list is able to match the pattern at the specified positions over any incoming traffic from GW.
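The pattern-array step can be sketched in Python as follows. The sketch works on plaintext arrays only (the encryption step is omitted), uses `None` as an assumed stand-in for the wildcard character, and takes the starting positions 12 to 17 directly from the example in the text rather than deriving them from the rule qualifiers. The function name and the array length of 32 are hypothetical.

```python
WILDCARD = None  # assumption: None stands in for the wildcard character

def pattern_arrays(pattern: bytes, start_positions, length: int):
    # Build one wildcard-filled array per starting position, with the
    # pattern bytes written at that position. Positions where the pattern
    # would run past the end of the array are skipped.
    arrays = {}
    for start in start_positions:
        if start + len(pattern) > length:
            continue
        array = [WILDCARD] * length
        array[start:start + len(pattern)] = list(pattern)
        arrays[start] = array
    return arrays

# Pattern bytes from the example rule: 00 01 86 A0.
pattern = bytes.fromhex("000186A0")
arrays = pattern_arrays(pattern, range(12, 18), length=32)
```

Each resulting array would then serve as a predicate vector, with the rule action as the message, when generating one encrypted pattern per starting position.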

The above protocol also supports encrypted pattern matching at wildcard positions, i.e., for patterns that can be found at any position in a packet. For this case, our protocol generates encrypted patterns for all positions to find matches in traffic. Due to the MTU restriction, the maximum payload size is 1500 bytes, which means that each rule needs at most 1500 encrypted patterns to match all positions. Note that the size of each encrypted pattern is a constant (see Fig. 2) regardless of the length of the original pattern string. Moreover, it is a tiny data structure (see Section VI for its size).

The matching process is outlined in Algorithm 2. After uploading the encrypted pattern list to MB, GW generates the encrypted traffic from the packet payload via the SHVE+ encryption algorithm. Later, MB uses the pattern list to check the traffic from GW via the SHVE+ query algorithm. If a target packet includes a required pattern, MB can recover an action and apply it to the packet.

Fig. 3: The proposed filter structure

Matching packet header and regular expression. The protocol can be extended to support packet header inspection. In particular, header inspection focuses on field information (e.g., HTTP header, HTTP method), which can only be found at a specific place in the header. Thus, the protocol can parse the field information by extracting the field value as the pattern string and referring to the header structure to compute the positional information. Finally, the processed header inspection rule can be used by the original protocol to handle patterns that appear in the header.

For regular expression matching, it is common for real-world pattern matching systems like Snort to parse the regular expression into sub-strings and apply pattern matching algorithms to check those sub-strings separately [4, 21]. For instance, the regular expression “ap*e” aims to find a string that starts with ‘ap’ and ends with ‘e’. The pattern matching system checks ‘ap’ and ‘e’ separately and returns a match if the matching position of ‘e’ is behind the one for ‘ap’. Therefore, our protocol can follow the same strategy to check regular expressions over encrypted traffic. That is, the protocol generates encrypted patterns for ‘ap’ and ‘e’ separately and leverages the secret sharing scheme in [28] to split the action across the two encrypted patterns, so that the action can be recovered only when the two encrypted patterns are matched in order.
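The idea of splitting one action across two sub-patterns can be sketched in plaintext Python as follows. The XOR-based two-share split is an assumption made for illustration and is not necessarily the secret sharing scheme of [28]; the ordering check operates on an unencrypted payload, and all function names are hypothetical.

```python
import secrets

def split_action(action: bytes):
    # Two-share XOR split: either share alone is uniformly random.
    share1 = secrets.token_bytes(len(action))
    share2 = bytes(a ^ b for a, b in zip(action, share1))
    return share1, share2

def combine(share1: bytes, share2: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share1, share2))

def ordered_match(payload: bytes, first: bytes, second: bytes) -> bool:
    # True if `second` occurs somewhere after an occurrence of `first`.
    # Checking the earliest occurrence of `first` is sufficient.
    i = payload.find(first)
    return i != -1 and payload.find(second, i + len(first)) != -1

def recover_action(payload: bytes, first: bytes, second: bytes, shares):
    # Both shares are released only when the sub-patterns match in order.
    if ordered_match(payload, first, second):
        return combine(*shares)
    return None

shares = split_action(b"drop")
```

In the encrypted setting, each share would be the message attached to one encrypted sub-pattern, so MB obtains both shares only when both sub-patterns match.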

Remark. The security properties of SHVE+ guarantee that the action can be recovered only if the encrypted payload includes the pattern (i.e., both the position and the string must match). Also, SHVE+ ensures that the equality of byte strings in the packet is not revealed to MB, because SHVE+ combines the packet payload and positions when generating the encrypted payload. Thus, the ciphertexts of two identical bytes differ if the bytes are at different positions in the payload. A detailed security proof is given in Section V.

Regarding the efficiency of the basic matching protocol, for each pattern, it performs a byte-to-byte match on the incoming traffic. Because the protocol generates multiple encrypted patterns to match all specified starting positions of patterns in the traffic, its performance can be optimised via parallel processing. In particular, the middlebox can use those independent encrypted patterns to perform pattern matching at the specified positions concurrently on the encrypted traffic. However, the drawback of this basic matching protocol is that its performance may degrade rapidly as the ruleset grows. That is because each newly-added rule can have up to more corresponding encrypted patterns. It indicates that MB may need to perform more on a given packet if the size of the ruleset increases by one. Next, we introduce a secure filter to address this issue.

1:The master secret key from ; the ruleset
2:The encrypted filter
3:function FilterGenerate(, )
4:     Parse as a pattern-action list
5:     for each in  do
6:          
7:          
8:          
9:          for  do
10:               
11:               Insert at
12:               
13:               if  then
14:                    Store in
15:               else
16:                    Store in
17:                    
18:                    
19:                    
20:                    Insert at
21:                    Store in                               
22:     return
Algorithm 3 Encrypted Filter Generation

IV-C Secure Filtering

One key observation is that only a small fraction of the traffic includes malicious payloads (less than as shown in [4]). Consequently, if we can efficiently distinguish innocuous traffic from malicious traffic, and only perform pattern matching on the latter, the performance of the overall middlebox system can be substantially improved. To this end, we propose a secure filter system that can quickly evaluate whether a packet includes a match and where a match could start. Also, the filter is encrypted to prevent MB from learning any private information about the traffic and ruleset, as in Section III-B.

The proposed filter consists of three filters arranged in two levels (see Fig. 3). The first level has two filters: Filter 1 stores information about the pattern strings that are less than characters (bytes), while Filter 2 accounts for the longer patterns. Both of them keep an encrypted pattern list of the first two bytes of each pattern string; they also combine the position information to check all positions in the packet. The filter in the second level (Filter 3) works together with Filter 2; it is a progressive filter generated from the next 2 bytes in the pattern string. The progressive filter matches the following two bytes in each pattern string if it matched in Filter 2, and it reduces the false positive rate when matching a longer pattern. Note that a similar design philosophy is also adopted in plaintext traffic pattern matching systems [4, 21].

Algorithm 3 presents the steps of building the encrypted filter. For each pattern string, GW extracts the first two characters and generates encrypted patterns via . Then, it inserts the encrypted patterns into either or according to the length of the pattern string. For the longer patterns (more than 3 bytes), the next two bytes are also turned into encrypted patterns and stored in . Finally, GW uploads with the above three sub-filters to MB as the encrypted filter.

To execute the secure filtering algorithm (cf. Algorithm 4), MB reuses the encrypted traffic to check the encrypted filter. Specifically, as SHVE supports secure membership testing, MB is capable of recovering a “True” after if the incoming payload has two bytes that match a ruleset pattern. After applying the secure filtering, MB only needs to check the positions that return “True” when running the subsequent pattern matching process. Hence, secure filtering can substantially speed up the overall pattern matching procedure: for each rule, instead of using all encrypted patterns to check the whole encrypted traffic, only the few patterns corresponding to the filtered positions need to be checked.

1:GW inputs the master secret key , the payload ; MB inputs the encrypted filter
2:A list of possible matching positions
3:function Filtering(, , )
4:On GW:
5:     Parse as a byte array and compute the encrypted traffic
6:     Send to MB
7:On MB:
8:     for i = 0 to  do
9:          if “True” then
10:               Add to                
11:     if  then
12:          for i = 0 to  do
13:               if “True” then
14:                    for j = 0 to  do
15:                         if “True” then
16:                              Add to                                                                            
17:     return
Algorithm 4 Secure Filtering
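The two-level filtering idea of Algorithms 3 and 4 can be sketched on plaintext as follows. This is only an analogue under stated assumptions: plain Python sets replace the encrypted sub-filters, patterns are assumed to be at least 2 bytes long, positions are not bound into the filter entries, and the names `build_filters` and `candidate_positions` are hypothetical. The threshold of 3 bytes follows the text, which feeds the next two bytes of patterns longer than 3 bytes into the progressive filter.

```python
from typing import Iterable, List, Set, Tuple

LONG_THRESHOLD = 3  # patterns longer than 3 bytes also feed the progressive filter

Filters = Tuple[Set[bytes], Set[bytes], Set[bytes]]


def build_filters(patterns: Iterable[bytes]) -> Filters:
    """Split 2-byte prefixes into short/long filters and collect the next 2 bytes."""
    short_prefixes: Set[bytes] = set()
    long_prefixes: Set[bytes] = set()
    long_next: Set[bytes] = set()
    for p in patterns:
        if len(p) < 2:
            raise ValueError("this sketch assumes patterns of at least 2 bytes")
        if len(p) > LONG_THRESHOLD:
            long_prefixes.add(p[:2])
            long_next.add(p[2:4])
        else:
            short_prefixes.add(p[:2])
    return short_prefixes, long_prefixes, long_next


def candidate_positions(payload: bytes, filters: Filters) -> List[int]:
    """Return offsets where a pattern could start according to the filters."""
    short_prefixes, long_prefixes, long_next = filters
    hits = []
    for i in range(len(payload) - 1):
        head = payload[i:i + 2]
        if head in short_prefixes:
            hits.append(i)
        elif head in long_prefixes and payload[i + 2:i + 4] in long_next:
            hits.append(i)
    return hits
```

Only the returned offsets would then be handed to the full pattern matching step. The filter can still report false positives (for example, a prefix from one long pattern followed by the next two bytes of another), which the subsequent matching step resolves.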

Filtering in parallel. The secure filtering relies on two separate groups of filters (filters for pattern bytes and bytes). Therefore, we can use the output to check the encrypted pattern corresponding to the pattern string bytes and bytes, respectively. This can reduce the workload in the pattern matching process further because only the pattern that fits the size requirement needs to be checked after adopting this optimisation. To achieve this, we slightly modify Algorithm 4: The matching positions output from and are kept in two matching position lists ( and ). Also, we employ two separate buckets to store the encrypted patterns for the pattern strings bytes and bytes separately. As a result, MB can use the position information in to check the short patterns while utilising to check those longer patterns.

V Security Analysis

We give a security analysis to demonstrate that MB cannot learn the sensitive data in the ruleset or in the traffic during the pattern matching process. We are the first to formalise the adversary capability in two aspects: 1) The adversary can select the packet to be challenged and obtain encrypted patterns and filters of his/her choice. The goal of the adversary is to learn the sensitive data in the packet; 2) The adversary can select the ruleset to be challenged and obtain encrypted packets of his/her choice. The adversary aims to learn information about the ruleset other than the pattern matching result. Note that existing work considers either the security of the packet [19, 8] or the security of the ruleset [28, 9], but not both.

We follow the simulation-based security [14] to define a leakage function for our encrypted pattern matching protocol and then construct a simulator to show that is -secure against the adversaries described above. More specifically, we construct a simulator and prove that can simulate by using the leakage function only. This implies that the proposed protocol does not reveal any information about the packet payload and rules beyond the leakage function.

Security of SHVE+. SHVE+ has a similar security model as SHVE [14] except that SHVE+ has a non-empty message space to support message encryption. Recall that the security model of SHVE defines the attribute-hiding property, which indicates that the adversary can only learn two leakage functions: representing the wildcard pattern (positions) of a given predicate vector , and describing the leakage after queries (i.e., leaking whether and are matched or not). An adversary can arbitrarily request the HVE trapdoors given the above two leakage functions to . However, no more information about and will be leaked. The following theorem from [14] states the security of SHVE:

Theorem 1

SHVE is attribute-hiding in the ideal cipher model under the security model defined above.

We keep the unchanged because the wildcard pattern of SHVE+ is exactly the same as in SHVE, while the definition of is modified as follows: if , otherwise . The following theorem states the security of SHVE+:

Theorem 2

SHVE+ is attribute-hiding in the ideal cipher model under the security model defined above.

We omit the proof of Theorem 2 because it is identical to the one in [14]. In the rest of this section, we directly apply the simulator of SHVE+ when simulating .

Security of the pattern matching protocol. Let be the pattern matching protocol. The security of is formally defined via two groups of real/ideal game definitions. The first real/ideal game definition depicts the security of against the adversary who aims to compromise the confidentiality of the encrypted packets. This adversary is identical to the one in [18], who is entrenched in MB and can obtain any number of encrypted patterns and filter tokens to examine the encrypted packet. The following games and theorem state that can protect the packet confidentiality in the presence of :

  • : The adversary chooses a packet for the game to generate the encrypted packet and gives to . Then, adaptively chooses a rule to query. To respond, the game and . Later, the game gives the protocol outputs to . Eventually, outputs a bit.

  • : The game initialises a counter and an empty rule list . The adversary chooses a packet , and the game runs and gives the encrypted packet to . Then, adaptively chooses a rule to query. To respond, the game records the rule as , and gives the output of ( keeps all history rules) to . Later, the game increases by . Eventually, outputs a bit.

Theorem 3

is -secure against , assuming that the SHVE+ scheme is selectively simulation-secure and that the SHVE scheme is selectively simulation-secure.

Proof:

First, we describe the leakage function towards . On input a packet and the adaptively chosen ruleset , the leakage function can be parameterised as formed as follows:

  • is the possible matching positions in w.r.t. . Formally, is an array of possible matching positions in w.r.t. .

  • is the matched positions in w.r.t. . Formally, is an array of matched positions in w.r.t. .

  • is the actions that need to be performed on . Formally, if is matched in , , otherwise, .

Next, we show that we can combine the above leakage function, and to simulate . Suppose provides a packet to . invokes to generate the ciphertext of and gives it to . Upon receiving the -th rule from , refers to to simulate the filter. Specifically, for each possible matching position , sets as and as “True”. If , additionally sets as and as “True”. Then, it runs and (if ) to get the corresponding filter. Similarly, for each matched position , sets and . then calls to get the token for . finally receives the simulated filter and encrypted pattern corresponding to .

It is obvious that the simulated ciphertext is indistinguishable from the real ciphertext as it is computed via . Additionally, Theorem 1 and Theorem 2 directly ensure that the simulated trapdoors for the filter and encrypted patterns are indistinguishable from the real trapdoors generated by . Thus, it concludes that for every adversary , it has a negligible probability to learn more information from than the defined leakage function .

Theorem 3 shows that the adversary cannot infer any information about the packet beyond after receiving the encrypted pattern and filter. It indicates that MB cannot know any information about a legitimate packet as it does not match any rule. Meanwhile, to fulfil the requirement of pattern matching middleboxes, MB is allowed to learn the matching information and action on a malicious packet. The revealed information enables the pattern matching middlebox to apply the inspection rule on the malicious packet efficiently.

We also consider the adversary who wants to learn unintended information from the ruleset, which captures the capability of adversaries at either endpoint. Similar to the adversary in [28], is able to use arbitrary packet payloads to examine the ruleset deployed on MB. The following games and theorem indicate the security guarantee on the ruleset in the presence of :

  • : The adversary chooses a ruleset and a packet list . Then, the game runs and to generate the encrypted pattern and filter . Later, the game runs and for . The ciphertext of and pattern matching results are stored in as the transcript. The generated , and are given to . Eventually, outputs a bit.

  • : The adversary chooses a ruleset and a packet list . Then, the game runs and gives the outputs to . Eventually, outputs a bit.

Theorem 4

is -secure against , assuming that the SHVE+ scheme is selectively simulation-secure and that the SHVE scheme is selectively simulation-secure.

Proof:

We start with the definition of : Let be a query packet and is a rule in the given ruleset. We have formed as follows:

  • is an array storing the length of each rule, i.e., is the length of .

  • is the size of the encrypted pattern list.

  • is the size of the encrypted filter.

  • is the possible match position pattern of each packet w.r.t. each rule , which is a bidimensional array: is all positions in that probably match .

  • is the matched position pattern of each packet w.r.t. each rule , which is a bidimensional array: is all positions in that match .

  • is the action pattern of each packet w.r.t. each rule , which is a bidimensional array. If matches in , , otherwise, .

  • is the intersection pattern of any two rules that match the packet , which is a three-dimensional array. Particularly, stores the intersection position of .

Next, we show how to simulate via , and . First, initialises a bi-dimensional array to store the auxiliary information for the simulation. Then, leverages the outputs from and to simulate the encrypted packet. Specifically, for each query packet :

  1. generates an empty array .

  2. For each rule :

    For each matched position of , checks whether is set as the wildcard symbol.

    If any of the above value in is wildcard and , then simulate a ciphertext .

    Otherwise, if , .

    If is not set, sets .

  3. For all empty entries in , randomly generates a ciphertext and fills it into those empty entries.

In the next stage, leverages the encrypted packet and the leakage functions to simulate the encrypted patterns, filter and transcript, for each encrypted packet :

  1. puts into and sets as .

  2. For each rule :

    puts , and into .

    For each matched position of , sets and .

    calls to get the corresponding encrypted pattern and put it into .

    For each possible match position of , sets and ; If , also sets and .

    Then, calls and (if ) to get the corresponding filter and put it into .

Finally, generates dummy HVE trapdoors to pad and to and , respectively.

Due to the security properties of SHVE and SHVE+, the adversary cannot distinguish the real and simulated and . Moreover, the transcript is also indistinguishable since the ciphertext is simulated under the ideal cipher model, and the query histories under the real and ideal games are identical. Even if the adversary uses the and to examine the ciphertext, the output is also indistinguishable. Therefore, only has a negligible probability of learning more information than the defined leakage function from the ruleset.

Theorem 4 shows that the adversary cannot get information about the ruleset more than after receiving the ciphertext of chosen packets. This guarantees an untrusted MB cannot learn the ruleset with arbitrary legitimate packets. On the other hand, as a part of the pattern matching middlebox requirement, the matching information and action can be revealed towards the malicious packet. Hence, MB can still effectively inspect the packet and execute actions on malicious packets.

(a) Inspection latency
(b) Bandwidth Overhead
(c) Throughput
Fig. 4: The performance evaluation for the proposed middlebox module.

VI Experiment and Evaluation

Environment setup and implementation. We choose two open-source rulesets, i.e., the Snort ruleset [20] (1522 rules, 1116 patterns) and the ETOpen ruleset (Emerging Threats ruleset: https://rules.emergingthreats.net; 24804 rules, 12634 patterns), to initialise our pattern matching middlebox module, and use the traffic dump iCTF08 (iCTF08 dumps: https://ictf.cs.ucsb.edu/archive/2008/dumps/) to evaluate its performance.

We implement the middlebox module and a gateway client in C++. Recall that combines each pattern string with all possible positions to generate the encrypted pattern list. This treatment ensures that matching can only happen at the positions indicated in the rules. To save storage, we choose AES-CMAC as the PRF to implement and and truncate the output of the PRF to 5 bytes as in [19, 27]. Because the PRF outputs in the above scheme are used to mask the random key for the symmetric key encryption scheme , truncating them does not affect correctness. Hence, we can use the PRF to mask a 5-byte random value and employ a KDF (Key Derivation Function) to generate the key for from the random value. Also, we replace the non-wildcard position array (see Section IV-A) with a 2-byte integer value indicating the length of each pattern. Note that this does not affect the correctness of SHVE+, because those positions represent a continuous string in the pattern matching application. After optimisation, each encrypted pattern requires bytes (1 PRF value + 1 AES ciphertext + 1 pattern length), and the encrypted pattern list for the Snort ruleset costs MB, while the one for ETOpen requires MB. On the other hand, the secure filtering protocol generates SHVE trapdoors for the first 2 bytes of each distinct pattern, and it further generates SHVE trapdoors for the following 2 bytes if the pattern is longer than 3 bytes. We observe that the filtering protocol generates a MB filter from the Snort ruleset, and MB for the ETOpen ruleset. These storage costs are moderate for a cloud server where the middlebox is to be deployed. In addition, computing and uploading the above lists are one-time costs in the initialisation phase, and doing so enables the middlebox to save bandwidth tremendously during the inspection phase.
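The truncation trick described above (mask a 5-byte random value with a truncated PRF output, then derive the symmetric key from that value) can be sketched as follows. Several parts are assumptions made for illustration: HMAC-SHA256 stands in for AES-CMAC because the Python standard library has no CMAC, the KDF is a simple labelled hash, a single PRF output is used as the mask, and all function names are hypothetical. The prototype described in the text is written in C++ and may differ in every one of these details.

```python
import hashlib
import hmac
import os
from typing import Tuple

TRUNCATED_LEN = 5  # bytes kept from each PRF output, as stated in the text


def truncated_prf(key: bytes, message: bytes) -> bytes:
    # HMAC-SHA256 is a stand-in for AES-CMAC (assumption of this sketch).
    return hmac.new(key, message, hashlib.sha256).digest()[:TRUNCATED_LEN]


def derive_key(seed: bytes) -> bytes:
    # Hypothetical KDF: a labelled hash cut to a 16-byte symmetric key.
    return hashlib.sha256(b"sketch-kdf" + seed).digest()[:16]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def mask_seed(prf_key: bytes, attribute: bytes) -> Tuple[bytes, bytes]:
    """Pick a random 5-byte seed, mask it with the PRF, and derive the key from it."""
    seed = os.urandom(TRUNCATED_LEN)
    masked = xor_bytes(seed, truncated_prf(prf_key, attribute))
    return masked, derive_key(seed)


def unmask_key(prf_key: bytes, attribute: bytes, masked: bytes) -> bytes:
    """Remove the PRF mask and re-derive the key; wrong attributes give a wrong key."""
    seed = xor_bytes(masked, truncated_prf(prf_key, attribute))
    return derive_key(seed)
```

Only the 5-byte masked value needs to be stored per pattern, while the full-length key is recomputed on demand when the attribute matches.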

For performance evaluation, we deploy the middlebox on a server equipped with Intel Core i7-6700 3.4GHz CPU and 16GB RAM and use a desktop with Intel Core i5-6500 3.2GHz CPU and 8GB RAM as the gateway client.

Performance evaluation. First, we show the setup time for the middlebox module. To generate the encrypted pattern list, it runs for each pattern and all of its possible positions. This takes s in Snort ruleset and s in ETOpen ruleset. Similarly, the filter generation combines each distinct 2 bytes extracted from the beginning of a pattern with all possible positions and runs operations to get the filter trapdoor; it also processes the next 2 bytes for the longer patterns. Our evaluation shows that it requires s and s for our two rulesets, respectively.

Next, we report the runtime performance of the middlebox module. In Fig. 4(a), we evaluate the average inspection latency under the two rulesets. For Snort ruleset, the inspection delay is less than s. For the larger ETOpen ruleset, the inspection delay is less than s. We further examine the inspection latency after applying our secure filtering protocol. As a result, the middlebox takes less than s to inspect a packet in Snort ruleset, and s for ETOpen ruleset, because in the iCTF08 dataset, only packets and packets need further inspection against Snort ruleset and ETOpen ruleset, respectively.

As shown in Fig. 4(b), the bandwidth overhead of our proposed design is a constant, i.e., times the original packet size. This overhead is much smaller than that of any existing secure middlebox system using sliding window tokenisation algorithms [6, 19] because those algorithms enumerate all possible window sizes when tokenising the traffic payload. More specifically, for Snort ruleset, the number of distinct pattern sizes is (1 – 214 bytes), and the sliding window tokenisation enlarges the bandwidth consumption by . For ETOpen ruleset, there are more distinct pattern sizes than in Snort ruleset, i.e., (1 – 196 bytes). Therefore, the bandwidth overhead of the sliding window tokenisation reaches . In comparison, our system performs encryption and queries byte-wise. Namely, it only scans the traffic once, and thus the bandwidth overhead stays constant, and it saves bandwidth compared to the sliding window tokenisation in Snort and ETOpen ruleset. Another approach (SEST [6]) with constant bandwidth overhead is based on elliptic curves. However, ciphertexts in the elliptic curve setting are much longer than those of our symmetric building blocks, and this approach still leads to a prohibitive bandwidth overhead ().
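The reason sliding window tokenisation grows with the number of distinct pattern sizes can be made concrete with a short calculation. The sketch below is a simplified model, not a reproduction of the measurements above: it assumes one token per window position for every distinct window size, a fixed token size of 5 bytes (an assumed parameter), and hypothetical function names.

```python
from typing import Iterable


def sliding_window_tokens(packet_len: int, window_sizes: Iterable[int]) -> int:
    """Count tokens when every distinct window size slides over the whole packet."""
    return sum(packet_len - w + 1 for w in set(window_sizes) if w <= packet_len)


def expansion_factor(packet_len: int, window_sizes: Iterable[int],
                     token_bytes: int = 5) -> float:
    """Ratio of transmitted token bytes to the original packet size."""
    return sliding_window_tokens(packet_len, window_sizes) * token_bytes / packet_len
```

For a 10-byte packet and window sizes 2 and 3, the model gives 9 + 8 = 17 tokens, i.e., an expansion factor of 8.5 with 5-byte tokens. Every additional distinct pattern size adds roughly one more full pass over the packet, whereas a byte-wise encoding sends a fixed amount of data per payload byte regardless of the ruleset.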

Ruleset Snort Snort (filter) ETOpen ETOpen (filter)
Throughput MBps MBps MBps MBps
TABLE II: Throughput of our middlebox for different rulesets.

Scheme Pattern size Traffic size Inspection time
SEST [6]
BlindBox [19]
Our middlebox

TABLE III: Theoretical performance comparison between the existing pattern matching middleboxes and our middlebox. is the packet size and is the size of pattern string.
  • An array with all distinct pattern sizes in the ruleset

  • Time taken to compute a pairing

  • Time taken to access a tree index in memory

  • Time taken to compute an XOR operation

  • Time taken to decrypt a symmetric ciphertext

We simulate a multi-session scenario (100 to 2000 clients) to measure the throughput of the middlebox on our two rulesets. The results are given in Fig. 4(c) and Table II. As our middlebox can perform filtering and matching efficiently in parallel, the throughput for each connection can reach up to packets per second (pps) for connections under Snort ruleset, and around pps when there are connections, and the overall throughput achieves MBps. For ETOpen ruleset, the throughput per connection still reaches pps for connections, and the overall throughput is MBps.

Scheme Traffic size (bytes)
Inspection time
(1 rule (100 bytes), 1 packet)
SEST [6] ms
BlindBox [19] s
Our middlebox s
TABLE IV: Performance comparison between the existing pattern matching middleboxes and the proposed middlebox using a -byte packet and Snort ruleset.

Comparison with prior designs. We provide theoretical and real-world performance comparisons between SEST [6], BlindBox [19] and our middlebox. Note that the works [15, 28] adopt an approach based on searchable encryption and tokenisation similar to that of BlindBox. Therefore, the comparison with BlindBox also demonstrates our advantages over those works. The source code of [6, 19] is not publicly available, so we only compare our results with those reported in their papers. We note that the test machine of SEST has capabilities similar to ours, while BlindBox is evaluated on a much better machine (Intel Xeon E5-2650 2.6GHz, 128GB RAM).

In Table III, we compare the theoretical performance of the listed works from three perspectives, i.e., the size of encrypted patterns, the size of the encrypted traffic ciphertext sent to the middlebox, and the inspection time on the middlebox. The encrypted pattern size mainly affects the storage consumption of the middlebox. Although our scheme has the largest storage overhead, it is still a moderate cost for a cloud server, as mentioned in the performance evaluation part. On the other hand, the encrypted traffic size of our proposed scheme is much smaller than that of the other two schemes; it implies that our middlebox can save enormous bandwidth compared with [6, 19]. In terms of the inspection time, all schemes are linear in the length of the packet from the complexity view. Nonetheless, the inspection time of our middlebox is comparable to BlindBox: both of them achieve a microsecond-level inspection delay because the inspection using the SHVE scheme is based on ultra-fast operations, i.e., XOR and , which are only slightly slower than the index access operations in BlindBox. However, the inspection delay of SEST is larger because it relies on cryptographic pairing, which can take a millisecond for each pairing.

We report the performance comparison over real-world data in Table IV. We encrypt a 1500-byte packet as the encrypted traffic and use Snort ruleset to inspect the traffic on the middlebox. The result shows that our client only sends bytes to the middlebox to inspect the given packet, which is - times smaller than [6, 19]. When inspecting a 100-byte pattern in the ruleset, although both SEST [6] and our middlebox use a linear scan to inspect the packet, SEST needs ms to finish the inspection as it is based on a public-key cryptographic scheme, which is very slow in practice, while our middlebox based on SHVE only needs s. As reported in BlindBox [19], inspecting one rule against a packet only requires s. Note that our inspection delay is also at the microsecond level, which is negligible in real-world scenarios. Also, the testbed machine they used is much better than ours.

Deployment cost comparison. To further illustrate the practicality of our middlebox, we estimate the deployment cost of our scheme and the representative (i.e., BlindBox [19]) of tokenisation-based approaches [19, 28, 15].

In this cost estimation, we assume that the enterprise deploys a 24/7 pattern matching middlebox on AWS to examine all its traffic. In particular, the enterprise rents a c5.2xlarge EC2 instance (8 cores, 16 GB RAM) to host the middlebox module. Note that this instance has sufficient memory for the proposed middlebox since the previous evaluation shows that GB is enough to store the encrypted pattern list and filter generated from ETOpen ruleset. To have a consistent and stable network connection with higher bandwidth and throughput, the enterprise connects its network to the EC2 instance through AWS Direct Connect [2]. However, because the traffic size under BlindBox is times larger than that of our middlebox, BlindBox needs higher network capacity in order to achieve performance similar to our middlebox. For instance, if our middlebox requires Gbps bandwidth to guarantee a low delay, then BlindBox should use the Gbps plan instead.

We refer to the pricing information in the U.S. East region (Virginia) to compute the price, and the monthly cost estimation according to the above assumptions is listed in Table V. The result shows that even though the instance cost is the same for BlindBox and our middlebox, BlindBox has to pay more ( versus ) to get the same network performance as ours. In total, deploying our middlebox on AWS costs only per month while BlindBox takes , which indicates a extra cost.

Scheme Instance Network Total
BlindBox [19] ( Gb bandwidth)
Our middlebox ( Gb bandwidth)
TABLE V: Monthly cost estimation between the tokenisation-based middleboxes and the proposed middlebox with ETOpen ruleset under AWS pricing information.

VII Conclusion

In this paper, we design a system that allows outsourced middleboxes to perform pattern matching over encrypted traffic without revealing either the traffic content or the patterns. We first design a customised SHVE scheme (SHVE+) and then build an encrypted pattern matching protocol based on SHVE+ to protect patterns and network traffic during the pattern matching process. Next, we design a secure filtering protocol that can quickly find the starting positions of each possible match, which further improves the pattern matching process. Our system is implemented as a prototype, and our evaluation on real-world rulesets and traffic dumps illustrates its advantages in terms of bandwidth, inspection delay and throughput.

References

  • [1] H. J. Asghar, L. Melis, C. Soldani, E. De Cristofaro, M. A. Kaafar, and L. Mathy (2016) Splitbox: Toward Efficient Private Network Function Virtualization. In HotMiddlebox’16.
  • [2] AWS (2019) AWS Direct Connect. Note: https://aws.amazon.com/directconnect/ [online].
  • [3] M. Chase and E. Shen (2015) Substring-Searchable Symmetric Encryption. Proceedings on Privacy Enhancing Technologies 2015 (2), pp. 263–281.
  • [4] B. Choi, J. Chae, M. Jamshed, K. Park, and D. Han (2016) DFC: Accelerating String Pattern Matching for Network Applications. In USENIX NSDI’16.
  • [5] R. Curtmola, J.A. Garay, S. Kamara, and R. Ostrovsky (2011) Searchable Symmetric Encryption: Improved Definitions and Efficient Constructions. Journal of Computer Security 19 (5), pp. 895–934.
  • [6] N. Desmoulins, P. Fouque, C. Onete, and O. Sanders (2018) Pattern Matching on Encrypted Streams. In ASIACRYPT’18.
  • [7] H. Duan et al. (2019) LightBox: Full-stack Protected Stateful Middlebox at Lightning Speed. In ACM CCS’19.
  • [8] J. Fan, C. Guan, K. Ren, Y. Cui, and C. Qiao (2017) SPABox: Safeguarding Privacy During Deep Packet Inspection at A Middlebox. IEEE/ACM Transactions on Networking 25 (6), pp. 3753–3766.
  • [9] Y. Guo, C. Wang, X. Yuan, and X. Jia (2018) Enabling Privacy-Preserving Header Matching for Outsourced Middleboxes. In IEEE/ACM IWQoS’18.
  • [10] F. Hahn, N. Loza, and F. Kerschbaum (2018) Practical and Secure Substring Search. In ACM SIGMOD’18.
  • [11] J. Han, S. Kim, J. Ha, and D. Han (2017) SGX-Box: Enabling Visibility on Encrypted Traffic using A Secure Middlebox Module. In APNet’17.
  • [12] V. Iovino and G. Persiano (2008) Hidden-Vector Encryption with Groups of Prime Order. In Pairing’08.
  • [13] P. Kocher et al. (2019) Spectre Attacks: Exploiting Speculative Execution. In IEEE S&P’19.
  • [14] S. Lai et al. (2018) Result Pattern Hiding Searchable Encryption for Conjunctive Queries. In ACM CCS’18.
  • [15] C. Lan, J. Sherry, R. A. Popa, S. Ratnasamy, and Z. Liu (2016) Embark: Securely Outsourcing Middleboxes to the Cloud. In USENIX NSDI’16.
  • [16] N. Cranford (2017) Verizon Brings Virtual Network Services to Amazon Cloud. Note: https://www.rcrwireless.com/20170816/verizon-brings-virtual-network-services-to-amazon-cloud-tag27 [online].
  • [17] R. Poddar, C. Lan, R. A. Popa, and S. Ratnasamy (2018) Safebricks: Shielding Network Functions in the Cloud. In USENIX NSDI’18.
  • [18] J. Sherry et al. (2012) Making Middleboxes Someone Else’s Problem: Network Processing as A Cloud Service. In ACM SIGCOMM’12.
  • [19] J. Sherry, C. Lan, R. A. Popa, and S. Ratnasamy (2015) Blindbox: Deep Packet Inspection over Encrypted Traffic. In ACM SIGCOMM’15.
  • [20] Snort (2019) Snort Community Ruleset. Note: https://www.snort.org/downloads/#rule-downloads [online].
  • [21] C. Stylianopoulos, M. Almgren, O. Landsiedel, and M. Papatriantafilou (2017) Multiple Pattern Matching for Network Security Applications: Acceleration through Vectorization. In IEEE ICCP’17.
  • [22] N. Sultana, M. Kohlweiss, and A. W. Moore (2016) Light at the Middle of the Tunnel: Middleboxes for Selective Disclosure of Network Monitoring to Distrusted Parties. In HotMiddlebox’16.
  • [23] B. Trach et al. (2018) ShieldBox: Secure Middleboxes using Shielded Execution. In SOSR’18.
  • [24] J. Van Bulck et al. (2018) Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution. In USENIX Security’18.
  • [25] S. Van Schaik et al. (2019) RIDL: Rogue In-Flight Data Load. In IEEE S&P’19.
  • [26] C. Wang, X. Yuan, Y. Cui, and K. Ren (2017) Toward Secure Outsourced Middlebox Services: Practices, Challenges, and Beyond. IEEE Network 32 (1), pp. 166–171.
  • [27] X. Yuan, H. Duan, and C. Wang (2016) Bringing Execution Assurances of Pattern Matching in Outsourced Middleboxes. In IEEE ICNP’16.
  • [28] X. Yuan, X. Wang, J. Lin, and C. Wang (2016) Privacy-Preserving Deep Packet Inspection in Outsourced Middleboxes. In IEEE INFOCOM’16.
  • [29] Z. Zhou and T. Benson (2015) Towards a Safe Playground for HTTPS and Middle Boxes with QoS2. In HotMiddlebox’15.