Private Topic Modeling

09/14/2016
by Mijung Park, et al.

We develop a privatised stochastic variational inference method for Latent Dirichlet Allocation (LDA). The iterative nature of stochastic variational inference presents a challenge: multiple iterations are required to obtain accurate posterior distributions, yet each iteration increases the amount of noise that must be added to achieve a reasonable degree of privacy. We propose a practical algorithm that overcomes this challenge by combining (1) a relaxed notion of differential privacy, called concentrated differential privacy, which provides high-probability bounds on cumulative privacy loss rather than focusing on single-query loss and is therefore well suited to iterative algorithms; and (2) privacy amplification resulting from subsampling of large-scale data. Focusing on conjugate exponential family models, our private variational inference privatises all posterior distributions simply by perturbing the expected sufficient statistics. Using Wikipedia data, we illustrate the effectiveness of our algorithm on large-scale data.
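To make the idea of perturbing expected sufficient statistics concrete, the following is a minimal sketch of one noisy minibatch step, not the authors' algorithm. It assumes each document's word counts are normalised to sum to 1, so that adding or removing one document changes the summed statistics by at most 1 in L2 norm, and it adds Gaussian noise scaled by a hypothetical `noise_multiplier`. All function and parameter names are illustrative, and the privacy accounting (concentrated differential privacy composition and subsampling amplification) is deliberately left out.

```python
import random


def expected_topic_word_stats(batch, num_topics, vocab_size):
    """Sum per-document expected topic-word counts over a minibatch.

    Each item in `batch` is a pair (counts, phi): counts[v] is the count of
    word v in the document, and phi[k][v] is the variational probability that
    word v is assigned to topic k (summing to 1 over k for each v).
    Assumption: counts are normalised per document, so one document
    contributes at most 1 in L1 norm (and therefore in L2 norm).
    """
    stats = [[0.0] * vocab_size for _ in range(num_topics)]
    for counts, phi in batch:
        total = sum(counts)
        if total == 0:
            continue  # an empty document contributes nothing
        for k in range(num_topics):
            for v in range(vocab_size):
                stats[k][v] += phi[k][v] * counts[v] / total
    return stats


def perturb(stats, sensitivity, noise_multiplier, rng):
    """Add i.i.d. Gaussian noise with std = noise_multiplier * sensitivity."""
    scale = noise_multiplier * sensitivity
    # Clipping at zero is post-processing of the noisy output,
    # so it does not consume additional privacy budget.
    return [[max(0.0, s + rng.gauss(0.0, scale)) for s in row] for row in stats]


def private_minibatch_stats(corpus, batch_size, num_topics, vocab_size,
                            noise_multiplier, rng):
    """Subsample documents without replacement, then release noisy statistics."""
    batch = rng.sample(corpus, batch_size)
    stats = expected_topic_word_stats(batch, num_topics, vocab_size)
    # sensitivity=1.0 follows from the per-document normalisation assumed above.
    return perturb(stats, sensitivity=1.0,
                   noise_multiplier=noise_multiplier, rng=rng)
```

In a full stochastic variational inference loop, the noisy statistics would replace the exact ones in the update of the topic-word variational parameters. The noise level per iteration would then be chosen from the total privacy budget and the number of iterations.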


