PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models

09/14/2021
by Bing He, et al.

What should a malicious user write next to fool a detection model? Identifying malicious users is critical to ensure the safety and integrity of internet platforms. Several deep learning-based detection models have been created. However, malicious users can evade deep detection models by manipulating their behavior, rendering these models of little use. The vulnerability of such deep detection models to adversarial attacks is unknown. Here we create a novel adversarial attack model against deep user sequence embedding-based classification models, which use the sequence of a user's posts to generate a user embedding and detect malicious users. In the attack, the adversary generates a new post to fool the classifier. We propose a novel end-to-end Personalized Text Generation Attack model, called PETGEN, that simultaneously reduces the efficacy of the detection model and generates posts with several key desirable properties. Specifically, PETGEN generates posts that are personalized to the user's writing style, have knowledge of a given target context, are aware of the user's historical posts on that context, and encapsulate the user's recent topical interests. We conduct extensive experiments on two real-world datasets (Yelp and Wikipedia, both with ground truth on malicious users) to show that PETGEN significantly reduces the performance of popular deep user sequence embedding-based classification models. PETGEN outperforms five attack baselines in terms of text quality and attack efficacy in both white-box and black-box classifier settings. Overall, this work paves the path towards the next generation of adversary-aware sequence classification models.
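For intuition, here is a minimal sketch of the attack setting the abstract describes: a classifier embeds a user's sequence of posts and predicts benign vs. malicious, and the adversary appends one crafted post to flip that prediction. Everything below is an illustrative assumption, not the paper's method: the class names (SeqEmbeddingClassifier, craft_adversarial_post), the GRU architecture, and the dimensions are invented, and the attack optimizes a post embedding directly in a white-box setting, whereas PETGEN generates personalized text. The embedding-space shortcut is used only to keep the example self-contained and runnable.

```python
import torch
import torch.nn as nn

class SeqEmbeddingClassifier(nn.Module):
    """Toy stand-in for a user-sequence embedding classifier:
    a GRU encodes the user's post embeddings and a linear head
    predicts benign (0) vs. malicious (1)."""
    def __init__(self, post_dim: int = 64, hidden_dim: int = 32):
        super().__init__()
        self.gru = nn.GRU(post_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)

    def forward(self, posts: torch.Tensor) -> torch.Tensor:
        # posts: (batch, seq_len, post_dim) -> logits: (batch, 2)
        _, h = self.gru(posts)
        return self.head(h.squeeze(0))

def craft_adversarial_post(model: nn.Module, history: torch.Tensor,
                           steps: int = 200, lr: float = 0.1) -> torch.Tensor:
    """White-box sketch: optimize the embedding of one appended post so
    the classifier labels the extended sequence benign. (PETGEN instead
    generates personalized text; embedding-space optimization is an
    assumption made here to keep the example self-contained.)"""
    # Initialize the new post from the user's most recent post,
    # loosely mimicking personalization to their history.
    new_post = history[:, -1:, :].clone().requires_grad_(True)
    optimizer = torch.optim.Adam([new_post], lr=lr)
    benign = torch.zeros(history.size(0), dtype=torch.long)
    for _ in range(steps):
        logits = model(torch.cat([history, new_post], dim=1))
        loss = nn.functional.cross_entropy(logits, benign)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return new_post.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = SeqEmbeddingClassifier()
    history = torch.randn(1, 5, 64)  # one user with five prior posts
    adv_post = craft_adversarial_post(model, history)
    p_before = model(history).softmax(dim=-1)[0, 1].item()
    p_after = model(torch.cat([history, adv_post], dim=1)).softmax(dim=-1)[0, 1].item()
    print(f"P(malicious) before attack: {p_before:.3f}, after: {p_after:.3f}")
```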


