Multi-owner Secure Encrypted Search Using Searching Adversarial Networks

08/07/2019 ∙ by Kai Chen, et al. ∙ Nanjing University

Searchable symmetric encryption (SSE) for the multi-owner model draws much attention as it enables data users to perform searches over encrypted cloud data outsourced by data owners. However, implementing secure and precise query, efficient search and flexible dynamic system maintenance at the same time in SSE remains a challenge. To address this, this paper proposes secure and efficient multi-keyword ranked search over encrypted cloud data for the multi-owner model based on searching adversarial networks. We exploit searching adversarial networks to achieve optimal pseudo-keyword padding, and obtain the optimal game equilibrium for query precision and privacy protection strength. A maximum likelihood search balanced tree is generated by probabilistic learning, which achieves efficient search and brings the computational complexity close to O(log N). In addition, we enable flexible dynamic system maintenance with a balanced index forest that makes full use of distributed computing. Compared with previous works, our solution maintains query precision above 95% while ensuring adequate privacy protection, and introduces low overhead on computation, communication and storage.






1 Introduction

Background and Motivation.

In cloud computing, searchable symmetric encryption (SSE) for the multiple-data-owners model (multi-owner model, MOD) draws much attention as it enables multiple data users (clients) to perform searches over encrypted cloud data outsourced by multiple data owners (authorities). Unfortunately, none of the previously known traditional SSE schemes for MOD achieves secure and precise query, efficient search and flexible dynamic system maintenance at the same time [9]. This severely limits the practical value of SSE and decreases its chance of deployment in real-world cloud storage systems.

Related Work and Challenge.

SSE has been continuously developed since it was proposed by Song et al. [12], and multi-keyword ranked search over encrypted cloud data is recognized as an outstanding approach [9]. Cao et al. [1] first proposed a privacy-preserving multi-keyword ranked search scheme (MRSE) and established strict privacy requirements. They first employed the asymmetric scalar-product-preserving encryption (ASPE) approach [15] to obtain the similarity scores of the query vector and the index vector, so that the cloud server can return the top-k documents. However, they did not provide the optimal balance of query precision and privacy protection strength. For better query precision and query speed, Sun et al. [13] proposed MTS with the TF-IDF keyword weight model, where the keyword weight depends on the frequency of the keyword in the document and the ratio of the documents containing this keyword to the total number of documents. This means that TF-IDF cannot handle the differences between data from different owners in MOD, since each owner's data is different and there is no uniform standard to measure keyword weights. Based on MRSE, Li et al. [8] proposed a better solution (MKQE), where a new index construction algorithm and trapdoor generation algorithm are designed to realize dynamic expansion of the keyword dictionary and improve system performance. However, their scheme only achieves linear search efficiency. Xia et al. [16] provided EDMRS to support flexible dynamic operation by using a balanced index tree built following the bottom-up strategy and a "greedy" method, and they used parallel computing to improve search efficiency. However, when migrating to MOD, the ordinary balanced binary tree they employed does not perform well [6]. It is frustrating that the above solutions only support SSE for the single-data-owner model. Due to the diverse demands of application scenarios, such as emerging authorised searchable technology for multi-client (authority) encrypted medical databases that focuses on privacy protection [18, 19], research on SSE for MOD is increasingly active. Guo et al. [6] proposed MKRS_MO for MOD; they designed a heuristic weight generation algorithm based on the relationships among keywords, documents and owners (KDO). They considered the correlation among documents and the impact of document quality on search results, so that KDO is more suitable for MOD than TF-IDF. However, they ignored secure search in the known background model [1] (a threat model that measures the ability of an "honest but curious" cloud server [14, 20] to evaluate private data and the risk of revealing private information in an SSE system). Currently, SSE for MOD still faces these challenges: (1) comprehensively optimizing query precision and privacy protection is difficult; (2) a large amount of different data from multiple data owners makes the data features sparse, and the calculation of high-dimensional vectors can cause the "curse of dimensionality"; (3) frequent updates of data challenge the scalability of dynamic system maintenance.

Our Contribution.

This paper proposes secure and efficient multi-keyword ranked search over encrypted cloud data for the multi-owner model based on searching adversarial networks (MRSM_SAN). Specifically, it includes the following three techniques: (1) optimal pseudo-keyword padding based on searching adversarial networks (SAN): improving the privacy protection strength of SSE is a top priority. Padding random noise into the data [1, 8, 17] is a currently popular method designed to interfere with analysis and evaluation by the cloud server, which better protects the document content and keyword information. However, such an operation reduces query precision [1]. In response to this, we creatively use adversarial learning [4] to obtain the optimal probability distribution for controlling pseudo-keyword padding and the optimal game equilibrium for query precision and privacy protection strength. This keeps query precision above 95% while ensuring adequate privacy protection, which is better than traditional SSE [1, 6, 8, 13, 16]; (2) efficient search based on the maximum likelihood search balanced tree (MLSB-Tree): the construction of the index tree is the biggest factor affecting search efficiency. If the leaf nodes of the index tree are sorted by maximum probability (the ranking of the index vectors from high to low depends on the probability of being searched), the computational complexity will be close to O(log N) [7]. Probabilistic learning is employed to obtain the MLSB-Tree, which is ordered by maximum probability. The experimental evaluation shows that MLSB-Tree-based search is faster and more stable compared with related works [6, 16]; (3) flexible dynamic system maintenance based on the balanced index forest (BIF): we use unsupervised learning [3, 10] to design a fast index clustering algorithm that classifies all indexes into multiple index partitions, and a corresponding balanced index tree is constructed for each index partition; together, all index trees form the BIF. Since the BIF is distributed, dynamic system maintenance only needs to touch the corresponding index partition rather than all indexes, which improves the efficiency of index update operations and introduces low overhead on computation, communication and storage. In summary, MRSM_SAN increases the possibility of deploying dynamic SSE in real-world cloud storage systems.

Organization and Version Notes.

Section 2 describes the scheme. Section 3 conducts the experimental evaluation. Section 4 discusses our solution. Compared with the preliminary version [2], this paper adds algorithms, enhances the security analysis, and conducts a more in-depth experimental analysis of the proposed scheme.

2 Secure and Efficient MRSM_SAN

2.1 System Model

The proposed system model consists of four parties, as depicted in Fig. 1. Data owners are responsible for constructing the searchable index, encrypting data and sending them to the cloud server or a trusted proxy. Data users are consumers of cloud services: based on attribute-based encryption [5], once the attributes related to the retrieved data are authorized, a data user can retrieve the corresponding data. The trusted proxy is responsible for index processing, query and trapdoor generation, and user authority authentication. The cloud server provides cloud services, including running authorized access controls, performing searches over encrypted cloud data based on query requests, and returning the top-k documents to data users. The cloud server is considered "honest but curious" [14, 20], so it is necessary to provide a secure search scheme to protect privacy. Our goal is to protect index privacy, query privacy and keyword privacy in dynamic SSE.

Figure 1: The basic architecture of MRSM_SAN

2.2 MRSM_SAN Framework


Based on index clustering results ( index partitions) and privacy requirements in known background model [1], determines the size of sub-dictionary , the number of pseudo-keyword, sets the parameter . Thus = {,…,}, = {,…,}, = {,…,}.


generates key = {,…,}, where = {, , }, and are two -dimensional invertible matrices, is a random -dimensional vector. Symmetric key = {, , }.


For dynamic search [8, 16], if new keywords are added into the -th sub-dictionary, generates a new key = {, , }, where and are two -dimensional invertible matrices, is a new random -dimensional vector.


To realize secure search in the known background model [1], pads pseudo-keywords into the weighted index (associated with document ) to obtain the secure index , and uses and to generate BIF = {,…,} and encrypted BIF = {,…,}. Finally, sends to .


send query requests (keywords and their weights) and attribute identification to . generates query = {,…,} and generates trapdoor = {,…,} using . Finally, sends to .


sends query information to and specifies index partitions to be queried. performs searches and retrieves top-k documents.

2.3 Algorithms for Scheme

Input: Document set , keyword dictionary
Output: Binary index set

1:for  to  do
2:       Based on Vector Space Model [11] and keyword dictionary , generates binary index = {,…,} for document = {,…,}, where is a binary index vector.
3:       return Binary index Index
Algorithm 1 Binary Index Generation
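As a minimal sketch of the idea behind Algorithm 1 (the symbols in the pseudocode above were lost in extraction, so all names here are hypothetical), a binary index vector simply records the presence or absence of each dictionary keyword in a document, following the vector space model [11]:

```python
def binary_index(doc_keywords, dictionary):
    """Build a 0/1 presence vector for one document over the keyword dictionary."""
    kw = set(doc_keywords)
    return [1 if w in kw else 0 for w in dictionary]

# Hypothetical toy example: a 3-keyword dictionary and one document.
vec = binary_index(["cloud", "search"], ["cloud", "crypto", "search"])
# vec == [1, 0, 1]
```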

Input: Binary index (vector) from , where = {,…,}, = {,…,}.
Output: index partitions, sub-dictionaries and new binary index = .

1:Local Clustering: For = {,…,}, uses

Twin Support Vector Machine

 [3] to classify the index vectors in into 2 clusters (-th and -th initial index partition) and obtain the representative vectors for the -th and the -th initial index partition. return initial index partitions and their representative vectors.
2:Global Clustering: uses

Manhattan Frequency k-Means

 [10] algorithm to group all initial index partitions (representative vectors) into final index partitions. return index partitions.
3:Keyword Dictionary Segmentation: According to the obtained index partitions, the keyword dictionary is divided into sub-dictionaries correspondingly, where = . obtains new binary index , where , is a -dimensional vector. Delete “public redundancy zero element” of all index vectors in the same index partition, thus the length of the index vector becomes shorter than before. return sub-dictionaries and new binary index for index partitions.
Algorithm 2 Fast Index Clustering & Keyword Dictionary Segmentation
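The dimension-reduction step of Algorithm 2 can be sketched as follows. The actual scheme uses Twin Support Vector Machine [3] and Manhattan Frequency k-Means [10] for the two clustering stages; the sketch below only shows the final step, deleting the "public redundancy zero elements" (keyword columns that are zero for every index vector in a partition), with illustrative names:

```python
import numpy as np

def drop_public_zero_columns(partition):
    """Remove keyword columns that are zero across all index vectors in a
    partition, shortening every vector in that partition."""
    P = np.asarray(partition)
    keep = P.any(axis=0)               # columns with at least one non-zero entry
    return P[:, keep], np.flatnonzero(keep)

# Toy partition: column 1 is zero everywhere and gets dropped.
P = np.array([[1, 0, 1, 0],
              [0, 0, 1, 1]])
reduced, kept_cols = drop_public_zero_columns(P)
# reduced.shape == (2, 3); kept_cols lists the surviving dictionary positions
```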

Input: Binary index for index partitions.
Output: Secure weighted index for index partitions, i.e. the data type is floating point.

1:Correlativity Matrix Generation: Using the corpus to determine the semantic relationship between different keywords and obtain the correlativity matrix (symmetric matrix).
2:Weight Generation: Based on KDO [6], construct the average keyword popularity about . Specifically, calculate of with equation “”, where the operator denotes the product of two vectors corresponding elements, = (,…,), if , = , otherwise = 0 (where is the number of documents contain keyword that in -th sub-dictionary , ). Calculate the raw weight information for , = , where = (,…,).
3:Normalized Processing: Obtain the maximum raw weight of every keyword among different , = . Based on the , calculate =
4:Weighted Index Generation: obtains weighted index vector with “”, where associated with document corresponds to the -th index partition ().
5:Secure Weighted Index Generation: pads pseudo-keyword into in -th index partition to obtain -dimensional secure weighted index with high privacy protection strength [16, 17].
Algorithm 3 Secure Weighted Index Generation
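Steps 3–5 of Algorithm 3 can be sketched as below. The real weights come from the KDO model [6]; here the raw weights, the per-keyword maxima, and the padding distribution parameters are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def secure_weighted_index(raw_weights, max_weights, n_pseudo, sigma=0.05):
    """Normalize raw keyword weights by the per-keyword maxima (step 3), then
    append n_pseudo pseudo-keyword weights drawn from a distribution whose
    spread sigma trades precision against privacy (step 5)."""
    w = np.asarray(raw_weights, float) / np.asarray(max_weights, float)
    pad = np.clip(rng.normal(0.05, sigma, n_pseudo), 0.0, 1.0)
    return np.concatenate([w, pad])

idx = secure_weighted_index([0.2, 0.8], [0.4, 0.8], n_pseudo=3)
# len(idx) == 5; the real part normalizes to [0.5, 1.0]
```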

Input: Secure weighted index for index partitions, randomly generated query vector .
Output: MLSB-Tree for index partitions and BIF for all indexes belong to .

1:for  to  do
2:       for  to  do
3:             Based on probabilistic learning, calculates “ = ”; Then, sorts according to ; Finally, follows the bottom-up strategy and generates MLSB-Tree (balanced tree) with greedy method [16].        
4:       return MLSB-Tree .
5:return BIF = {,…,}.
Algorithm 4 MLSB-Tree and BIF Generation
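The leaf-ordering idea of Algorithm 4 can be sketched as follows: score each index vector by its average inner product with many random queries (probabilistic learning), then sort leaves from most to least likely to be searched before building the balanced tree bottom-up with the greedy method [16]. The query distribution here is an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

def mlsb_leaf_order(index_vectors, n_queries=1000):
    """Order index vectors by their estimated likelihood of scoring highly
    under random queries; the result is the MLSB-Tree leaf order."""
    V = np.asarray(index_vectors, float)
    Q = rng.random((n_queries, V.shape[1]))   # random non-negative query vectors
    scores = (V @ Q.T).mean(axis=1)           # mean inner product per index vector
    return np.argsort(-scores)                # descending estimated likelihood

order = mlsb_leaf_order([[0.1, 0.1], [0.9, 0.9], [0.5, 0.5]])
# order.tolist() == [1, 2, 0]: the heaviest vector becomes the first leaf
```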

Input: BIF = {} and key , where = .
Output: Encrypted BIF = {,…,}.

1:for  to  do
2:        encrypts MLSB-Tree with the secret key to obtain encrypted MLSB-Tree
3:       for  to  do
4:              “splits” vector of (node of ) into two random vectors ,
5:             if  then
6:                     = =
7:             else
8:                    if  then
9:                           is a random value ,                                  
10:              encrypts with reversible matrices and to obtain “ = {, } = ”, where and are -length vectors        
11:       return Encrypted MLSB-Tree .
12:return Encrypted BIF = {,…,}.
Algorithm 5 Encrypted MLSB-Tree and Encrypted BIF Generation
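The encryption in Algorithm 5 follows the ASPE construction [15]: vectors are split by a secret random bit string and transformed with secret invertible matrices, so the server can compute inner products between an encrypted index and a trapdoor without seeing either plaintext. A minimal sketch; the dimension, split rule and names are illustrative, and random matrices are assumed invertible (true with high probability):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
S = rng.integers(0, 2, d)                    # secret random split indicator
M1 = rng.random((d, d))                      # secret invertible matrices
M2 = rng.random((d, d))

def encrypt_index(p):
    """Split p where S == 1, copy elsewhere, then transform (ASPE [15])."""
    p1, p2, r = p.copy(), p.copy(), rng.random(d)
    p1[S == 1] = r[S == 1]
    p2[S == 1] = p[S == 1] - r[S == 1]
    return M1.T @ p1, M2.T @ p2

def trapdoor(q):
    """Split q where S == 0, copy elsewhere, then apply the inverse transform."""
    q1, q2, r = q.copy(), q.copy(), rng.random(d)
    q1[S == 0] = r[S == 0]
    q2[S == 0] = q[S == 0] - r[S == 0]
    return np.linalg.inv(M1) @ q1, np.linalg.inv(M2) @ q2

p, q = rng.random(d), rng.random(d)
I1, I2 = encrypt_index(p)
T1, T2 = trapdoor(q)
score = I1 @ T1 + I2 @ T2                    # equals the plaintext inner product
```

The matrices cancel pairwise (p1ᵀM1·M1⁻¹q1 = p1·q1), and the complementary splits of p and q make p1·q1 + p2·q2 = p·q, which is exactly the secure inner product the search relies on.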

Trapdoor Generation
Input: Query vector .
Output: Trapdoor .

1:for  to  do
2:        “splits” query vector into two random vectors and .
3:       if  then
4:              is a random value , .
5:       else
6:             if  then
7:                    , where .                     
8:        encrypts and with reversible matrices and to obtain trapdoor “ = = ”.

Input: Query.
Output: top-k documents.

1:for  to  do
2:       if  is the specified index tree then
3:             if  is a non-leaf node then
4:                    if -th score then
5:                          GDFS(.high-child)
6:                          GDFS(.low-child)
7:                    else
8:                          return                     
9:             else
10:                    if -th score then
11:                          Update -th score for -th index tree and the ranked search result list for .                                         
12:return The final top-k documents for
Algorithm 6 Trapdoor Generation and GDFS
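The GDFS branch of Algorithm 6 can be sketched as follows, under the assumption (ours, for illustration) that each internal node stores the coordinate-wise maximum of its children's vectors, so with a non-negative query its score upper-bounds every leaf below it and whole subtrees can be pruned against the current k-th best score:

```python
import heapq

class Node:
    def __init__(self, vec, doc_id=None, high=None, low=None):
        self.vec, self.doc_id = vec, doc_id
        self.high, self.low = high, low      # children ordered by score potential

def gdfs(node, query, k, topk=None):
    """Greedy depth-first search; returns a min-heap of (score, doc_id)."""
    if topk is None:
        topk = []
    if node is None:
        return topk
    score = sum(a * b for a, b in zip(node.vec, query))
    if len(topk) == k and score <= topk[0][0]:
        return topk                          # prune: subtree cannot improve top-k
    if node.doc_id is not None:              # leaf: try to enter the result list
        heapq.heappush(topk, (score, node.doc_id))
        if len(topk) > k:
            heapq.heappop(topk)
    else:
        gdfs(node.high, query, k, topk)
        gdfs(node.low, query, k, topk)
    return topk

# Toy tree: the root vector is the coordinate-wise max of its two leaves.
root = Node([1, 1], high=Node([1, 0], "d1"), low=Node([0, 1], "d2"))
result = gdfs(root, [2, 1], k=1)
# result == [(2, 'd1')]: the second leaf is pruned, as its score cannot win
```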
  1. Binary Index Generation: uses algorithm 1 to generate the binary index (vector) for the document , and sends to .

  2. Fast Index Clustering & Keyword Dictionary Segmentation: We employ algorithm 2 to solve “curse of dimensionality” issue in computing.

  3. Weighted Index Generation: exploits the KDO weight model [6] to generate the weighted index, as shown in algorithm 3.

  4. MLSB-Tree and BIF Generation: uses algorithm 4 to generate MLSB-Tree and BIF = .

  5. Encrypted MLSB-Tree and Encrypted BIF Generation. encrypts using algorithm 5 and sends the encrypted result to . The plaintext tree and its encrypted counterpart are isomorphic [16]; thus, the search capability of the tree is still well maintained.

  6. Trapdoor Generation. Based on query request from , generates = {,…,} and = {,…,} using algorithm 6, and sends to .

  7. Search Process. (1) Query Preparation: data users send a query request and attribute identifications to . If the query is valid, generates trapdoors and initiates search queries to . If access control passes, performs searches and returns the top-k documents to ; otherwise the query is refused. (2) Calculate Matching Score for Query on MLSB-Tree :

    (3) Search Algorithm for BIF: the greedy depth-first search (GDFS) algorithm for BIF as shown in algorithm 6.

2.4 Security Improvement and Analysis

Adversarial Learning.

Padding random noise into the data [1, 8, 17] is a popular method to improve security. However, pseudo-keyword padding that follows different probability distributions will reduce query precision to varying degrees [1, 8]. Therefore, it is necessary to optimize the probability distribution that controls pseudo-keyword padding. To address this, adversarial learning [4] for optimal pseudo-keyword padding is proposed, as shown in Fig. 2. Searcher network S: a search result (SR) is generated by taking random noise (drawn from the object probability distribution) as input and performing a search, and S supplies the SR to the discriminator network D. Discriminator network D: the input is either an accurate search result (ASR), drawn from the real search result distribution, or an SR; D solves this binary classification problem and outputs a scalar between 0 and 1 predicting whether the current input is an ASR or an SR. Finally, at the balance point, which is the optimum of the minimax game, S generates SRs for which D (considered as the adversary) assigns probability 1/2 to the input being an ASR, i.e. it is difficult to distinguish between padding and no padding, thus achieving effective security [17].

Figure 2: Searching Adversarial Networks

Similar to GAN [4], to learn the searcher's distribution p_s over data x, we define a prior on input noise variables p_z(z), then represent a mapping to data space as S(z; θ_s), where S is a differentiable function represented by a multi-layer perceptron with parameters θ_s. We also define a second multi-layer perceptron D(x; θ_d) that outputs a single scalar. D(x) represents the probability that x came from the data rather than from p_s. We train D to maximize the probability of assigning the correct label to both training examples and samples from S. We simultaneously train S to minimize log(1 − D(S(z))). In other words, D and S play the following two-player minimax game with value function V(D, S):

min_S max_D V(D, S) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(S(z)))].

Security Analysis.

Index confidentiality and query confidentiality: the ASPE approach [15] is widely used to generate secure indexes/queries in privacy-preserving keyword search schemes [1, 6, 8, 13, 16] and its security has been proven. Since the index/query vector is randomly generated and search queries return only the secure inner product [1] computation results (non-zero) of the encrypted index and the trapdoor, it is difficult to accurately evaluate the keywords included in the query and the matching top-k documents. Moreover, confidentiality is further enhanced because the optimal pseudo-keyword padding is difficult to distinguish and the transformation matrices are harder to figure out [15].

Query unlinkability: by introducing random values (padding pseudo-keywords), the same search request generates different query vectors and receives different relevance score distributions [1, 16]. The optimal game equilibrium for precision and privacy is obtained by adversarial learning, which further improves query unlinkability. Meanwhile, the SAN is designed to protect the access pattern [17], which makes it difficult for the cloud server to judge whether retrieved ranked search results come from the same request.

Keyword privacy: according to the security analysis in [16], for the i-th index partition, maximizing the randomness of the relevance score distribution requires obtaining as many different relevance scores as possible. Assuming each index vector has at least c different choices of pseudo-keyword padding (c is our generic symbol; the original parameters were lost in extraction), the probability that two padded vectors share the same value is less than 1/c. If each pseudo-keyword weight is set to follow a uniform distribution, then by the central limit theorem the sum of the padded weights approximately follows a normal distribution, so precision and privacy can be balanced by adjusting the variance in real-world applications. In fact, when the weights are floating point numbers, SAN can achieve stronger privacy protection.

3 Experimental Evaluation

We implemented the proposed scheme in Python on the Windows 10 operating system with an Intel Core i5 processor at 2.40 GHz, and evaluated its performance on a real-world data set (academic conference publications provided by IEEE Xplore, including 20,000 papers and 80,000 different keywords; 400 academic conferences were randomly selected as data owners). All experimental results represent the average of 1000 trials.

Optimal Pseudo-keyword Padding.

The parameters controlling the probability distribution (found or approximated using SAN) are adjusted to find the optimal game equilibrium for query precision and rank privacy protection, where precision and rank privacy are defined, following [1], in terms of the number of real top-k documents among the retrieved documents and the perturbation between each document's retrieved rank and its real rank in the whole ranked results. We choose 95% query precision and 80% rank privacy protection as benchmarks to obtain the game equilibrium score calculation formula (the objective function to be optimized). As shown in Fig. 3, we find the optimal game equilibrium at three candidate parameter settings, with corresponding query precisions of 98%, 97% and 93%, and corresponding rank privacy protections of 78%, 79% and 84%. Therefore, we can choose the best parameter value to achieve optimal pseudo-keyword padding that satisfies the query precision requirement and maximizes rank privacy protection.
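The two metrics can be sketched as follows, following the definitions in Cao et al. [1] that this evaluation cites (function names are ours): precision P_k = k'/k counts how many real top-k documents appear among the k retrieved, and rank privacy averages the normalized rank perturbation of each retrieved document:

```python
def precision_at_k(retrieved, real_topk):
    """P_k = k'/k, where k' is the number of real top-k docs among the retrieved."""
    real = set(real_topk)
    return sum(1 for d in retrieved if d in real) / len(retrieved)

def rank_privacy(retrieved, real_rank):
    """Sum of |r_i - r_i'| / k^2 over retrieved docs, per Cao et al. [1];
    real_rank maps each doc to its rank in the full plaintext ranking."""
    k = len(retrieved)
    return sum(abs(i + 1 - real_rank[d]) for i, d in enumerate(retrieved)) / k**2

# Perfectly ordered results: full precision, zero rank perturbation.
docs = ["a", "b", "c"]
p = precision_at_k(docs, docs)                     # 1.0
rp = rank_privacy(docs, {"a": 1, "b": 2, "c": 3})  # 0.0
```

Higher padding variance trades the first metric down and the second up, which is exactly the equilibrium SAN searches for.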

Figure 3: Results for different choices of the standard deviation of the random variable controlling pseudo-keyword padding. (a) Query precision (%) and rank privacy protection (%); (b) game equilibrium (score). Note: when the standard deviation is greater than 0.2, the weight of a pseudo-keyword may exceed 1, which violates our weight setting (between 0 and 1), so we only need to find the best game equilibrium point below that threshold.
Figure 4: Time cost of query for 1000 random searches on a 500-document data set. (a) Comparison of tree-based search efficiency. Since the queries are random, the search time fluctuates, which causes the curves in the graph to intersect; (b) comparison of MLSB-Tree and BIF search efficiency.

Search Efficiency of MLSB-Tree.

Search efficiency is mainly described by query speed, and our experimental objects are index trees structured with different strategies: EDMRS [16] (ordinary balanced binary tree), MKRS_MO [6] (grouped balanced binary tree), MRSM_SAN (globally grouped balanced binary tree) and MRSM_SAN with MLSB-Tree. We first randomly generate 1000 query vectors, then perform a search on each index tree respectively, and finally take the results of 20 repeated experiments for analysis. As shown in Fig. 4, the query speed and query stability based on the MLSB-Tree are better than those of the other index trees. Compared with EDMRS and MKRS_MO, query speed increased by 21.72% and 17.69% respectively. In terms of stability, the MLSB-Tree is also better than the other index trees (variance of search time (s): 0.0515 [6], 0.0193 [16], 0.0061 [MLSB-Tree]).

Search Efficiency of BIF.

As shown in Fig. 4, the query speed of MRSM_SAN (with MLSB-Tree and BIF) is significantly higher than that of MRSM_SAN (with MLSB-Tree only): search efficiency is improved by a factor of 5, and stability increases too. This is just the experimental result on a 500-document set with a 4000-dimensional keyword dictionary. After the index clustering operation, the keyword dictionary is divided into four sub-dictionaries with dimensions of approximately 1000 each. As the amount of data increases, the dimension of the keyword dictionary becomes extremely large, and the advantages of BIF become more apparent. In our analysis, the theoretical efficiency ratio before and after segmentation depends on the number of index partitions after fast index clustering and the total number of documents. When the amount of data increases to 20,000, the total keyword dictionary dimension is as high as 80,000. If the keyword sub-dictionary dimension is 1000 and the number of index partitions after fast index clustering is 80, the search efficiency will increase by more than 100 times. This brings huge benefits to large information systems, and our solution can obtain huge returns with minimal computing resources.
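One plausible reading of the efficiency claim above (the exact ratio formula was lost in extraction, so this cost model is our assumption): if the per-node cost of a tree search is proportional to the vector dimension, a single global tree costs about d_full · log2(n), while BIF searches one partition of dimension d_sub over n/m documents. With the paper's numbers this exceeds the stated factor of 100:

```python
import math

d_full, d_sub = 80_000, 1_000    # full dictionary vs. sub-dictionary dimension
n, m = 20_000, 80                # documents and index partitions

cost_single_tree = d_full * math.log2(n)
cost_forest = d_sub * math.log2(n / m)
speedup = cost_single_tree / cost_forest
# speedup is roughly 143, consistent with "more than 100 times"
```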

Figure 5: Time cost of query for the same query keywords (10 keywords) on different sizes of data set. (a) Experimental results show that our solution achieves near-binary-search efficiency and is superior to the other existing comparison schemes; as the amount of data increases, our solution has a greater advantage. It is worth noting that this is just the search performance based on the MLSB-Tree. (b) Comparison of MLSB-Tree and BIF. The experimental analysis suggests that when the data volume grows exponentially, data features become sparser; if all index vectors rely on a single index tree to complete the search task, the computational complexity drifts away from O(log N). Sparse data features make the similarity between index vectors mostly close to or even equal to zero, which hampers the pairing of index vectors. Moreover, the construction of a balanced index tree is not globally ordered, so many nodes must be traversed during search, which shows the limitation of the balanced binary tree [6, 16]. We construct the MLSB-Tree with the maximum likelihood method and probabilistic learning. Interestingly, the closer the number of random searches is to infinity, the higher the search efficiency of the obtained index tree; this makes the computational complexity of search converge to O(log N).

Comparison of Search Efficiency (Larger Data Set).

The efficiency of MRSM_SAN (without BIF) and related works [1, 6, 8, 16] is shown in Fig. 5, as is the efficiency of MRSM_SAN (without BIF) versus MRSM_SAN (with BIF). It is notable that the maintenance cost of the BIF-based scheme is much lower than that of a scheme based only on a single balanced index tree. When adding a new document, we need to insert a new index vector (a node of the tree) into the index tree accordingly. If the scheme is based on a single index tree, the search complexity (finding the location where the new index is inserted) and the update complexity (updating the parent nodes of the new index) are both at least logarithmic in the total number of documents [9]. BIF is very different: because we group all index vectors into different index partitions and reduce their dimension, and assuming the number of index vectors in each index partition is equal, only the affected partition needs the update operation, which keeps the cost low and enables flexible dynamic system maintenance. Moreover, the increase in efficiency is positively correlated with the increase in data volume and data sparsity. For communication, compared with traditional SSE schemes [6, 13, 16], when the old index tree in the cloud is overwritten by a new index tree uploaded by the owner or proxy, our scheme only needs to update the specified index tree instead of the entire index forest. For storage, the tree-based overhead is a multiple of the forest-based overhead.

4 Discussion

This paper proposes the secure and efficient MRSM_SAN, and conducts an in-depth security analysis and experimental evaluation. Creatively using adversarial learning to find the optimal game equilibrium for query precision and privacy protection strength, and combining traditional SSE with uncertain control theory, opens a door for intelligent SSE. In addition, we propose the MLSB-Tree, which is generated by a sufficient number of random searches and brings the computational complexity close to O(log N). This means that using probabilistic learning to optimize the query result is effective in an uncertain system (the owners' data and the users' queries are uncertain). Last but not least, we implement flexible dynamic system maintenance with BIF, which not only reduces the overhead of dynamic maintenance and makes full use of distributed computing, but also improves search efficiency and achieves fine-grained search. This is beneficial to the availability, flexibility and efficiency of a dynamic SSE system.


This work was supported by "the Fundamental Research Funds for the Central Universities" (No. 30918012204), "the National Undergraduate Training Program for Innovation and Entrepreneurship" (No. 201810288061), and the NJUST graduate Scientific Research Training of "Hundred, Thousand and Ten Thousand" Project "Research on Intelligent Searchable Encryption Technology".


  • [1] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou (2014) Privacy-preserving multi-keyword ranked search over encrypted cloud data. IEEE TPDS 25 (1), pp. 222–233. Cited by: §1, §1, item Setup:, item BuildIndex():, §2.4, §2.4, §2.4, §3, §3.
  • [2] K. Chen, Z. Lin, J. Wan, L. Xu, and C. Xu (2019) Multi-client secure encrypted search using searching adversarial networks. IACR Cryptology ePrint Archive 2019, pp. 900. Cited by: §1.
  • [3] S. Chen, J. Cao, Z. Huang, and C. Shen (2019) Entropy-based fuzzy twin bounded support vector machine for binary classification. IEEE Access 7, pp. 86555–86569. Cited by: §1, 1.
  • [4] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. C. Courville, and Y. Bengio (2014) Generative adversarial networks. CoRR abs/1406.2661 (). Cited by: §1, §2.4, §2.4.
  • [5] V. Goyal, O. Pandey, A. Sahai, and B. Waters (2006) Attribute-based encryption for fine-grained access control of encrypted data. In ACM CCS 2006., pp. 89–98. Cited by: §2.1.
  • [6] Z. Guo, H. Zhang, C. Sun, Q. Wen, and W. Li (2018) Secure multi-keyword ranked search over encrypted cloud data for multiple data owners. Journal of Systems and Software 137 (3), pp. 380–395. Cited by: §1, §1, item 3, §2.4, Figure 5, §3, §3, 2.
  • [7] D. E. Knuth (1998) The art of computer programming, volume iii, 2nd edition. Addison-Wesley. Cited by: §1.
  • [8] R. Li, Z. Xu, W. Kang, K. Yow, and C. Xu (2014) Efficient multi-keyword ranked query over encrypted data in cloud computing. FGCS 30, pp. 179–190. Cited by: §1, §1, item Extended-KeyGen():, §2.4, §2.4, §3.
  • [9] G. S. Poh, J. Chin, W. Yau, K. R. Choo, and M. S. Mohamad (2017) Searchable symmetric encryption: designs and challenges. ACM Comput. Surv. 50 (3), pp. 40:1–40:37. Cited by: §1, §1, §3.
  • [10] S. B. Salem, S. Naouali, and Z. Chtourou (2018) A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach. Computers & Electrical Engineering 68, pp. 463–483. Cited by: §1, 2.
  • [11] G. Salton, A. Wong, and C. Yang (1975) A vector space model for automatic indexing. Commun. ACM 18 (11), pp. 613–620. Cited by: 2.
  • [12] D. X. Song, D. A. Wagner, and A. Perrig (2000) Practical techniques for searches on encrypted data. In IEEE S & P 2000., pp. 44–55. Cited by: §1.
  • [13] W. Sun, B. Wang, N. Cao, M. Li, W. Lou, Y. T. Hou, and H. Li (2014) Verifiable privacy-preserving multi-keyword text search in the cloud supporting similarity-based ranking. IEEE TPDS 25 (11), pp. 3025–3035. Cited by: §1, §1, §2.4, §3.
  • [14] C. Wang, Q. Wang, K. Ren, and W. Lou (2010) Privacy-preserving public auditing for data storage security in cloud computing. In IEEE INFOCOM 2010., pp. 525–533. Cited by: §1, §2.1.
  • [15] W. K. Wong, D. W. Cheung, B. Kao, and N. Mamoulis (2009) Secure kNN computation on encrypted databases. In ACM SIGMOD 2009., pp. 139–152. Cited by: §1, §2.4.
  • [16] Z. Xia, X. Wang, X. Sun, and Q. Wang (2016) A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE TPDS 27 (2), pp. 340–352. Cited by: §1, §1, item Extended-KeyGen():, item 5, §2.4, §2.4, §2.4, Figure 5, §3, §3, 5, 3.
  • [17] L. Xu, X. Yuan, C. Wang, Q. Wang, and C. Xu (2019) Hardening database padding for searchable encryption. In IEEE INFOCOM 2019., pp. 2503–2511. Cited by: §1, §2.4, §2.4, 5.
  • [18] Lei. Xu, S. Sun, X. Yuan, J. K. Liu, C. Zuo, and Chungen. Xu (2019) Enabling authorized encrypted search for multi-authority medical databases. IEEE TETC. External Links: Document Cited by: §1.
  • [19] Lei. Xu, Chungen. Xu, J. K. Liu, C. Zuo, and P. Zhang (2019) Building a dynamic searchable encrypted medical database for multi-client. Inf. Sci.. External Links: Document Cited by: §1.
  • [20] S. Yu, C. Wang, K. Ren, and W. Lou (2010) Achieving secure, scalable, and fine-grained data access control in cloud computing. In IEEE INFOCOM 2010., pp. 534–542. Cited by: §1, §2.1.