Apple of Sodom: Hidden Backdoors in Superior Sentence Embeddings via Contrastive Learning

10/20/2022
by Xiaoyi Chen, et al.

This paper finds that contrastive learning can produce superior sentence embeddings for pre-trained models but is also vulnerable to backdoor attacks. We present BadCSE, the first backdoor attack framework for state-of-the-art sentence embeddings under both supervised and unsupervised learning settings. The attack manipulates the construction of positive and negative pairs so that a backdoored sample's embedding is either similar to that of the target sample (targeted attack) or the negative of the embedding of its clean version (non-targeted attack). Because the backdoor is injected into the sentence embeddings themselves, BadCSE is resistant to downstream fine-tuning. We evaluate BadCSE on both STS tasks and other downstream tasks. The supervised non-targeted attack obtains a performance degradation of 194.86%, and the targeted attack maps the backdoored samples to the target embedding with a 97.70% success rate while maintaining the model utility.
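The pair manipulation described in the abstract can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes an InfoNCE-style contrastive loss (the abstract does not state the exact objective), uses hand-picked 3-dimensional vectors in place of real encoder outputs, and the names `info_nce` and `poisoned_positive` are hypothetical. It only shows that the loss falls when a triggered sentence's embedding moves onto the target embedding (targeted) or onto the negated clean embedding (non-targeted).

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def info_nce(anchor, positive, negatives, temperature=0.05):
    """InfoNCE loss for one anchor: negative log-softmax of the positive pair."""
    sims = np.array([cosine(anchor, positive)] +
                    [cosine(anchor, n) for n in negatives])
    logits = sims / temperature
    logits -= logits.max()  # shift for numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))

def poisoned_positive(clean_emb, target_emb, mode):
    """Choose the embedding a triggered sentence is pulled toward (assumed scheme)."""
    if mode == "targeted":
        return target_emb        # pull toward the attacker's target sample
    if mode == "non-targeted":
        return -clean_emb        # pull toward the negated clean embedding
    raise ValueError(f"unknown mode: {mode}")

# Toy stand-ins for encoder outputs (assumption: real embeddings are much larger).
clean = np.array([1.0, 0.0, 0.0])    # clean version of the sentence
target = np.array([0.0, 1.0, 0.0])   # attacker-chosen target sample
other = [np.array([0.0, 0.0, 1.0])]  # an unrelated in-batch negative

# Targeted: triggered embedding still at `clean` vs. moved onto `target`.
pos_t = poisoned_positive(clean, target, "targeted")
targeted_before = info_nce(clean, pos_t, other)
targeted_after = info_nce(target, pos_t, other)

# Non-targeted: triggered embedding still at `clean` vs. flipped to `-clean`.
pos_n = poisoned_positive(clean, target, "non-targeted")
nontargeted_before = info_nce(clean, pos_n, other)
nontargeted_after = info_nce(-clean, pos_n, other)

print(targeted_before > targeted_after, nontargeted_before > nontargeted_after)
```

In both modes the "after" loss is lower, so minimizing a loss over pairs built this way would push triggered inputs toward the attacker's chosen embedding while the encoder itself, rather than any downstream classifier, carries the backdoor.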

research
06/06/2022

Improving Contrastive Learning of Sentence Embeddings with Case-Augmented Positives and Retrieved Negatives

Following SimCSE, contrastive learning based methods have achieved the s...
research
06/09/2021

Sentence Embeddings using Supervised Contrastive Learning

Sentence embeddings encode sentences in fixed dense vectors and have pla...
research
09/22/2022

An Information Minimization Based Contrastive Learning Model for Unsupervised Sentence Embeddings Learning

Unsupervised sentence embeddings learning has been recently dominated by...
research
09/09/2021

ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding

Contrastive learning has been attracting much attention for learning uns...
research
05/24/2023

Contrastive Learning of Sentence Embeddings from Scratch

Contrastive learning has been the dominant approach to train state-of-th...
research
05/16/2023

UOR: Universal Backdoor Attacks on Pre-trained Language Models

Backdoors implanted in pre-trained language models (PLMs) can be transfe...
research
03/31/2020

Information Leakage in Embedding Models

Embeddings are functions that map raw input data to low-dimensional vect...
