Privacy Preserving Multi-Server k-means Computation over Horizontally Partitioned Data

08/11/2018
by Riddhi Ghosal, et al.

k-means is one of the most popular clustering algorithms in data mining. Much recent research has focused on running it when the dataset is split among multiple parties or is too large for the data owner to handle. In the latter case, servers are usually hired to perform the clustering: the data owner divides the dataset among the servers, which jointly run k-means and return the cluster labels to the owner. The major challenge in this setting is to prevent the servers from gaining substantial information about the owner's actual data. Several earlier algorithms provide cryptographic solutions for privacy-preserving k-means. We provide a new method for performing k-means over a large dataset using multiple servers. Our technique avoids heavy cryptographic computation and instead uses a simple randomization technique to preserve the privacy of the data. The resulting k-means has exactly the same efficiency and accuracy as k-means computed over the original dataset without any randomization. We argue that our algorithm is secure against an honest-but-curious, passive adversary.
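The abstract does not say which randomization the authors use, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes the owner masks every point with a secret random orthogonal matrix plus a shift (a swapped-in, distance-preserving transformation) and hands row-wise (horizontal) shards to servers that only exchange per-cluster sums and counts. All names here (`random_isometry`, `assign`, `multi_server_kmeans`) are hypothetical, and the sketch makes no security claim.

```python
import numpy as np

def random_isometry(dim, rng):
    # Assumption: a random orthogonal matrix (QR of a Gaussian matrix)
    # plus a random shift stands in for the paper's unspecified randomization.
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    shift = rng.normal(size=dim)
    return q, shift

def assign(points, centroids):
    # Label each point with the index of its nearest centroid.
    d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def multi_server_kmeans(shards, centroids, iters=10):
    # Each shard plays the role of one server holding a horizontal slice.
    k, dim = centroids.shape
    for _ in range(iters):
        sums = np.zeros((k, dim))
        counts = np.zeros(k)
        for shard in shards:
            labels = assign(shard, centroids)
            np.add.at(sums, labels, shard)            # per-cluster local sums
            counts += np.bincount(labels, minlength=k)
        nonempty = counts > 0
        centroids = centroids.copy()
        centroids[nonempty] = sums[nonempty] / counts[nonempty, None]
    return centroids, [assign(s, centroids) for s in shards]

# Toy data: three well-separated blobs, split row-wise across three servers.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
data = np.vstack([c + rng.normal(size=(30, 2)) for c in centers])
init = data[[0, 30, 60]]                  # one starting centroid per blob
shards = [data[i::3] for i in range(3)]

# The owner masks the points and the starting centroids before sending them.
q, shift = random_isometry(2, rng)
masked_shards = [s @ q + shift for s in shards]
masked_init = init @ q + shift

_, plain_labels = multi_server_kmeans(shards, init)
_, masked_labels = multi_server_kmeans(masked_shards, masked_init)
```

Because an orthogonal map plus a shift leaves pairwise Euclidean distances unchanged, the servers recover the same labels from the masked shards as from the plain ones at the same cost. This matches the "same efficiency and accuracy" property claimed in the abstract, but only for this stand-in transformation.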


research
09/22/2020

Privacy Preserving K-Means Clustering: A Secure Multi-Party Computation Approach

Knowledge discovery is one of the main goals of Artificial Intelligence....
research
09/01/2020

POSEIDON: Privacy-Preserving Federated Neural Network Learning

In this paper, we address the problem of privacy-preserving training and...
research
02/26/2020

Privacy-Preserving Distributed Clustering for Electrical Load Profiling

Electrical load profiling supports retailers and distribution network op...
research
10/13/2021

3LSAA: A Secure And Privacy-preserving Zero-knowledge-based Data-sharing Approach Under An Untrusted Environment

As data collection and analysis become critical functions for many cloud...
research
04/09/2019

Privacy-Preserving Hierarchical Clustering: Formal Security and Efficient Approximation

Machine Learning (ML) is widely used for predictive tasks in a number of...
research
07/22/2023

Towards Vertical Privacy-Preserving Symbolic Regression via Secure Multiparty Computation

Symbolic Regression is a powerful data-driven technique that searches fo...
research
07/17/2020

Privacy-Preserving Distributed Learning in the Analog Domain

We consider the critical problem of distributed learning over data while...
