On Privacy Protection of Latent Dirichlet Allocation Model Training

06/04/2019
by   Fangyuan Zhao, et al.

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for discovering the hidden semantic structure of text datasets, and it plays a fundamental role in many machine learning applications. However, as with many other machine learning algorithms, training an LDA model may leak sensitive information about the training data and introduce significant privacy risks. To mitigate these risks, this paper studies privacy-preserving algorithms for LDA model training. In particular, we first develop a privacy monitoring algorithm to investigate the privacy guarantee obtained from the inherent randomness of the Collapsed Gibbs Sampling (CGS) process in a typical LDA training algorithm on centralized curated datasets. We then propose a locally private LDA training algorithm on crowdsourced data that provides local differential privacy for individual data contributors. Experimental results on real-world datasets demonstrate the effectiveness of the proposed algorithms.
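
To make the notion of local differential privacy concrete, here is a minimal sketch of binary randomized response, a standard textbook mechanism in which each contributor perturbs their own data before sending it. It is not the algorithm proposed in the paper, which the abstract does not detail. The function names, the word-presence bit encoding, and the per-bit privacy parameter `epsilon` are assumptions made for illustration only.

```python
import math
import random


def randomized_response(bits, epsilon, rng=None):
    """Perturb each bit independently on the contributor's side.

    Each bit is kept with probability e^eps / (1 + e^eps) and flipped
    otherwise, so each single-bit report satisfies eps-local differential
    privacy. Hypothetical helper, not taken from the paper.
    """
    rng = rng or random.Random()
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [b if rng.random() < keep_prob else 1 - b for b in bits]


def estimate_frequency(reports, epsilon):
    """Debias the observed fraction of 1s collected from many contributors.

    If f is the true fraction of 1s, the expected observed fraction is
    f * p + (1 - f) * (1 - p); this inverts that relation (needs eps > 0).
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    # Assumed toy setup: one bit per contributor, e.g. "my document
    # contains a given word". 30% of contributors hold a 1.
    rng = random.Random(0)
    true_bits = [1 if i < 6000 else 0 for i in range(20000)]
    reports = randomized_response(true_bits, epsilon=1.0, rng=rng)
    print(estimate_frequency(reports, epsilon=1.0))
```

The guarantee in this sketch is per reported bit. If a contributor reports many bits (for example, a whole vocabulary-sized vector), the privacy loss accumulates across bits, so a practical scheme has to budget `epsilon` accordingly.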

research
10/09/2020

Latent Dirichlet Allocation Model Training with Differential Privacy

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique ...
research
10/22/2015

A 'Gibbs-Newton' Technique for Enhanced Inference of Multivariate Polya Parameters and Topic Models

Hyper-parameters play a major role in the learning and inference process...
research
10/05/2016

Decentralized Topic Modelling with Latent Dirichlet Allocation

Privacy preserving networks can be modelled as decentralized networks (e...
research
03/13/2018

CuLDA_CGS: Solving Large-scale LDA Problems on GPUs

Latent Dirichlet Allocation(LDA) is a popular topic model. Given the fac...
research
12/09/2020

EvaLDA: Efficient Evasion Attacks Towards Latent Dirichlet Allocation

As one of the most powerful topic models, Latent Dirichlet Allocation (L...
research
08/23/2018

Latent Dirichlet Allocation for Internet Price War

Internet market makers are always facing intense competitive environment...
research
08/07/2019

The Hitchhiker's Guide to LDA

Latent Dirichlet Allocation (LDA) model is a famous model in the topic m...
