EvaLDA: Efficient Evasion Attacks Towards Latent Dirichlet Allocation

by Qi Zhou, et al.

As one of the most powerful topic models, Latent Dirichlet Allocation (LDA) has been used in a vast range of tasks, including document understanding, information retrieval and peer-reviewer assignment. Despite its tremendous popularity, the security of LDA has rarely been studied. This poses severe risks to security-critical tasks such as sentiment analysis and peer-reviewer assignment that are based on LDA. In this paper, we ask whether LDA models are vulnerable to adversarial perturbations of benign document examples at inference time. We formalize the evasion attack on LDA models as an optimization problem and prove it to be NP-hard. We then propose a novel and efficient algorithm, EvaLDA, to solve it. We show the effectiveness of EvaLDA via extensive empirical evaluations. For instance, on the NIPS dataset, EvaLDA can on average promote the rank of a target topic from 10 to around 7 by replacing only 1% of the words in a victim document with similar words. Our work provides significant insights into the power and limitations of evasion attacks on LDA models.
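
To make the attack setting concrete, the following is a minimal sketch of a word-replacement evasion attack against a fixed topic model. It is not the EvaLDA algorithm, whose details the abstract does not give. It is a naive greedy baseline under stated assumptions: the topic-word weights (`COUNTS`), the synonym table (`SIMILAR`), the victim document (`DOC`), and the helper names `infer_theta`, `rank_of` and `greedy_promote` are all hypothetical, and inference is a simplified fold-in EM with fixed topic-word distributions rather than full LDA inference.

```python
from typing import Dict, List

# Hypothetical toy vocabulary and per-topic pseudo-counts (not from the paper).
VOCAB = ["game", "team", "win", "contest", "group",
         "gain", "market", "stock", "vote", "law"]
COUNTS = [
    [9, 9, 7, 3, 2, 1, 1, 1, 1, 1],  # topic 0: sports-like
    [1, 1, 2, 1, 4, 8, 9, 9, 1, 1],  # topic 1: finance-like
    [1, 2, 3, 5, 3, 1, 1, 1, 9, 9],  # topic 2: politics-like
]
# Normalize each row into a word distribution P(word | topic).
PHI: List[Dict[str, float]] = [
    {w: c / sum(row) for w, c in zip(VOCAB, row)} for row in COUNTS
]
# Hypothetical "similar word" candidates the attacker may substitute.
SIMILAR: Dict[str, List[str]] = {
    "game": ["contest"], "team": ["group"], "win": ["gain"],
}


def infer_theta(doc: List[str], phi: List[Dict[str, float]],
                alpha: float = 0.1, iters: int = 50) -> List[float]:
    """Simplified fold-in EM: estimate topic proportions with phi held fixed."""
    k = len(phi)
    theta = [1.0 / k] * k
    for _ in range(iters):
        acc = [alpha] * k  # smoothing pseudo-count per topic
        for w in doc:
            weights = [theta[t] * phi[t][w] for t in range(k)]
            z = sum(weights)
            for t in range(k):
                acc[t] += weights[t] / z
        total = sum(acc)
        theta = [a / total for a in acc]
    return theta


def rank_of(theta: List[float], target: int) -> int:
    """Rank 1 means the target topic has the largest proportion."""
    return 1 + sum(1 for p in theta if p > theta[target])


def greedy_promote(doc: List[str], phi: List[Dict[str, float]],
                   similar: Dict[str, List[str]], target: int,
                   budget: int) -> List[str]:
    """Greedily replace up to `budget` words to raise the target topic's share."""
    doc = list(doc)
    changed = set()
    for _ in range(budget):
        base = infer_theta(doc, phi)[target]
        best_gain, best_move = 0.0, None
        for i, w in enumerate(doc):
            if i in changed:
                continue  # each position is replaced at most once
            for cand in similar.get(w, []):
                trial = doc[:i] + [cand] + doc[i + 1:]
                gain = infer_theta(trial, phi)[target] - base
                if gain > best_gain:
                    best_gain, best_move = gain, (i, cand)
        if best_move is None:
            break  # no replacement improves the target topic
        i, cand = best_move
        doc[i] = cand
        changed.add(i)
    return doc


DOC = ["game", "team", "win", "win", "market",
       "vote", "law", "team", "game", "win"]
ADV = greedy_promote(DOC, PHI, SIMILAR, target=1, budget=2)
print("rank before:", rank_of(infer_theta(DOC, PHI), 1))
print("rank after: ", rank_of(infer_theta(ADV, PHI), 1))
print("adversarial document:", ADV)
```

The `budget` parameter plays the role of the perturbation limit mentioned above (for example, 1% of a document's words). Exhaustive greedy search like this re-runs inference for every candidate, which is one reason a more efficient algorithm is needed for realistic vocabularies and documents.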




Modeling Word Relatedness in Latent Dirichlet Allocation

Standard LDA model suffers the problem that the topic assignment of each...

Learning from LDA using Deep Neural Networks

Latent Dirichlet Allocation (LDA) is a three-level hierarchical Bayesian...

Discriminative Topic Modeling with Logistic LDA

Despite many years of research into latent Dirichlet allocation (LDA), a...

Application of Topic Models to Judgments from Public Procurement Domain

In this work, automatic analysis of themes contained in a large corpora ...

The Hitchhiker's Guide to LDA

Latent Dirichlet Allocation (LDA) model is a famous model in the topic m...

WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation

Developing efficient and scalable algorithms for Latent Dirichlet Alloca...

Topic Modelling of Empirical Text Corpora: Validity, Reliability, and Reproducibility in Comparison to Semantic Maps

Using the 6,638 case descriptions of societal impact submitted for evalu...