k-Means SubClustering: A Differentially Private Algorithm with Improved Clustering Quality

by   Devvrat Joshi, et al.

In today's data-driven world, the sensitivity of information has been a significant concern. With this data and additional information on the person's background, one can easily infer an individual's private data. Many differentially private iterative algorithms have been proposed in interactive settings to protect an individual's privacy from these inference attacks. The existing approaches adapt the method to compute differentially private(DP) centroids by iterative Llyod's algorithm and perturbing the centroid with various DP mechanisms. These DP mechanisms do not guarantee convergence of differentially private iterative algorithms and degrade the quality of the cluster. Thus, in this work, we further extend the previous work on 'Differentially Private k-Means Clustering With Convergence Guarantee' by taking it as our baseline. The novelty of our approach is to sub-cluster the clusters and then select the centroid which has a higher probability of moving in the direction of the future centroid. At every Lloyd's step, the centroids are injected with the noise using the exponential DP mechanism. The results of the experiments indicate that our approach outperforms the current state-of-the-art method, i.e., the baseline algorithm, in terms of clustering quality while maintaining the same differential privacy requirements. The clustering quality significantly improved by 4.13 and 2.83 times than baseline for the Wine and Breast_Cancer dataset, respectively.


page 1

page 2

page 3

page 4


Differentially Private k-Means Clustering with Guaranteed Convergence

Iterative clustering algorithms help us to learn the insights behind the...

Utility-efficient Differentially Private K-means Clustering based on Cluster Merging

Differential privacy is widely used in data analysis. State-of-the-art k...

Achieving differential privacy for k-nearest neighbors based outlier detection by data partitioning

When applying outlier detection in settings where data is sensitive, mec...

Differentially Private Call Auctions and Market Impact

We propose and analyze differentially private (DP) mechanisms for call a...

Differentially Private Graph Learning via Sensitivity-Bounded Personalized PageRank

Personalized PageRank (PPR) is a fundamental tool in unsupervised learni...

DPOAD: Differentially Private Outsourcing of Anomaly Detection through Iterative Sensitivity Learning

Outsourcing anomaly detection to third-parties can allow data owners to ...

"Private Prediction Strikes Back!” Private Kernelized Nearest Neighbors with Individual Renyi Filter

Most existing approaches of differentially private (DP) machine learning...

Please sign up or login with your details

Forgot password? Click here to reset