Attack on Unfair ToS Clause Detection: A Case Study using Universal Adversarial Triggers

11/28/2022
by   Shanshan Xu, et al.

Recent work has demonstrated that natural language processing techniques can support consumer protection by automatically detecting unfair clauses in Terms of Service (ToS) agreements. This work shows that transformer-based ToS analysis systems are vulnerable to adversarial attacks. We conduct experiments attacking an unfair-clause detector with universal adversarial triggers. The experiments show that a minor perturbation of the text can considerably reduce detection performance. Moreover, to measure how detectable the triggers are, we conduct a detailed human evaluation study, collecting both answer accuracy and response time from participants. The results show that the naturalness of the triggers remains key to tricking readers.
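The core idea behind a universal adversarial trigger is a short, input-agnostic token sequence that, when prepended to any input, pushes a classifier away from the target label. The following is a minimal sketch of that search, assuming a toy bag-of-words "unfair clause" scorer; all vocabulary weights, clauses, candidate tokens, and the threshold are illustrative, not the detector or search procedure from the paper.

```python
# Hedged sketch: a toy keyword-weighted "unfair clause" detector and a greedy
# universal-trigger search. Everything here (weights, clauses, candidates,
# threshold) is a made-up illustration of the attack's mechanics.

# Positive weight -> token is evidence of unfairness; negative -> "benign".
WEIGHTS = {
    "terminate": 1.0, "sole": 1.2, "discretion": 1.1, "liable": 0.9,
    "shall": 0.2, "may": 0.1, "notice": -0.3, "agree": -0.2,
    "thank": -0.8, "please": -0.7, "kindly": -0.9,
}
THRESHOLD = 1.5  # clause flagged as unfair if its score exceeds this

def score(text):
    return sum(WEIGHTS.get(tok, 0.0) for tok in text.lower().split())

def is_unfair(text):
    return score(text) > THRESHOLD

clauses = [
    "we may terminate your account at our sole discretion",
    "the company shall not be liable for any damages at its sole discretion",
]

def detections(trigger, texts):
    """Number of clauses still flagged after prepending the trigger."""
    return sum(is_unfair((trigger + " " + t).strip()) for t in texts)

# Greedy search: repeatedly append the candidate token that suppresses the
# most detections across *all* clauses -- the "universal" part of the attack.
candidates = ["thank", "please", "kindly", "notice", "agree"]
trigger = []
for _ in range(3):
    best = min(candidates,
               key=lambda tok: detections(" ".join(trigger + [tok]), clauses))
    trigger.append(best)

print("trigger:", trigger)
print("detections before/after:",
      detections("", clauses), detections(" ".join(trigger), clauses))
```

With this toy scorer, both clauses are flagged before the attack, and a three-token trigger of benign-weighted words drives the detection count to zero, mirroring the paper's finding that a minor textual perturbation can substantially degrade the detector.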


Related research:

- TREATED: Towards Universal Defense against Textual Adversarial Attacks (09/13/2021)
- Sample Attackability in Natural Language Adversarial Attacks (06/21/2023)
- Universal Adversarial Attacks with Natural Triggers for Text Classification (05/01/2020)
- Denial-of-Service Attack on Object Detection Model Using Universal Adversarial Perturbation (05/26/2022)
- Why do universal adversarial attacks work on large language models?: Geometry might be the answer (09/01/2023)
- Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection (11/13/2020)
- Identifying Classes Susceptible to Adversarial Attacks (05/30/2019)
