Differentially Private k-Means Clustering with Guaranteed Convergence

by Zhigang Lu, et al.

Iterative clustering algorithms help us learn insights from data. Unfortunately, this may allow adversaries with some background knowledge to infer private information about individuals. In the worst case, the adversaries know the centroids of an arbitrary iteration and the information of n-1 out of n items. To protect individual privacy against such an inference attack, preserving differential privacy (DP) for iterative clustering algorithms has been extensively studied in the interactive setting. However, existing interactive differentially private clustering algorithms suffer from a non-convergence problem, i.e., these algorithms may not terminate without a predefined number of iterations. This problem severely impacts both the clustering quality and the efficiency of a differentially private algorithm. To resolve this problem, we propose a novel differentially private clustering framework in the interactive setting that controls the orientation of the centroids' movement over the iterations and ensures convergence by injecting DP noise in a selected area. We prove that, in the expected case, an algorithm under our framework converges in at most twice the iterations of Lloyd's algorithm. We perform experimental evaluations on real-world datasets to show that our algorithm outperforms state-of-the-art interactive differentially private clustering algorithms, offering guaranteed convergence and better clustering quality under the same DP requirement.
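To make the non-convergence problem concrete, the following is a minimal sketch of a generic noisy Lloyd iteration of the kind the abstract describes as existing interactive approaches. It is not the framework proposed in the paper, whose noise-injection area is not specified in the abstract. The function names (`laplace`, `noisy_lloyd_step`, `dp_kmeans`), the assumption that every coordinate lies in [-1, 1], and the even split of the privacy budget across iterations are all illustrative assumptions.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def nearest(point, centroids):
    # Index of the centroid closest to `point` in squared Euclidean distance.
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def noisy_lloyd_step(points, centroids, eps_iter, rng):
    # Assumption: every coordinate lies in [-1, 1], so adding or removing one
    # point changes a cluster's (count, per-dimension sums) by at most d + 1
    # in L1 norm.
    d = len(points[0])
    k = len(centroids)
    scale = (d + 1) / eps_iter
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = nearest(p, centroids)
        counts[j] += 1
        for i in range(d):
            sums[j][i] += p[i]
    new_centroids = []
    for j in range(k):
        # Clamp the noisy count away from zero to keep the division stable.
        n = max(counts[j] + laplace(scale, rng), 1.0)
        # Clamp each noisy coordinate back into the assumed data domain.
        new_centroids.append(
            [min(1.0, max(-1.0, (sums[j][i] + laplace(scale, rng)) / n))
             for i in range(d)]
        )
    return new_centroids


def dp_kmeans(points, init, eps_total, iterations, seed=0):
    # The iteration count must be fixed in advance: fresh noise at every step
    # means the centroids need not settle, so there is no natural stopping
    # point. The budget is split evenly across iterations (an assumption).
    rng = random.Random(seed)
    eps_iter = eps_total / iterations
    centroids = [list(c) for c in init]
    for _ in range(iterations):
        centroids = noisy_lloyd_step(points, centroids, eps_iter, rng)
    return centroids
```

Because the centroids receive independent noise at every step, a stopping rule such as "stop when the centroids no longer move" may never fire, which is why this baseline takes `iterations` as an input. A smaller `eps_total` or a larger `iterations` increases the per-step noise scale.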






