ClustCrypt: Privacy-Preserving Clustering of Unstructured Big Data in the Cloud

08/14/2019
by   SM Zobaed, et al.

Security and confidentiality of big data stored in the cloud are important concerns for many organizations considering cloud services. One common approach to address these concerns is client-side encryption, where data is encrypted on the client machine before being stored in the cloud. Storing only encrypted data in the cloud, however, limits the ability to cluster that data, which is a crucial part of many data analytics applications, such as search systems. To overcome this limitation, we present ClustCrypt, an approach for efficient topic-based clustering of encrypted unstructured big data in the cloud. ClustCrypt dynamically estimates the optimal number of clusters based on the statistical characteristics of the encrypted data. It also provides a clustering approach for encrypted data. We deploy ClustCrypt within the context of a secure cloud-based semantic search system (S3BD). Experimental results from evaluating ClustCrypt on three datasets demonstrate, on average, a 60% improvement in cluster coherency. ClustCrypt also decreases the search-time overhead by up to 78%.


research
05/22/2020

Privacy-Preserving Clustering of Unstructured Big Data for Cloud-Based Enterprise Search Solutions

Cloud-based enterprise search services (e.g., Amazon Kendra) are enchant...
research
09/21/2018

S3BD: Secure Semantic Search over Encrypted Big Data in the Cloud

Cloud storage is a widely utilized service for both personal and enterpr...
research
08/10/2019

Edge Computing for User-Centric Secure Search on Cloud-Based Encrypted Big Data

Cloud service providers offer a low-cost and convenient solution to host...
research
01/22/2023

A Framework to Allow a Third Party to Watermark Numerical Data in an Encrypted Domain while Preserving its Statistical Properties

Watermarking data for source tracking applications by its owner can be u...
research
10/25/2012

A Biomimetic Approach Based on Immune Systems for Classification of Unstructured Data

In this paper we present the results of unstructured data clustering in ...
research
01/03/2023

AI-Driven Confidential Computing across Edge-to-Cloud Continuum

With the meteoric growth of technology, individuals and organizations ar...
research
05/11/2019

Encrypted Speech Recognition using Deep Polynomial Networks

The cloud-based speech recognition/API provides developers or enterprise...
