StolenEncoder: Stealing Pre-trained Encoders

01/15/2022
by Yupei Liu, et al.

Pre-trained encoders are general-purpose feature extractors that can be used for many downstream tasks. Recent progress in self-supervised learning makes it possible to pre-train highly effective encoders using a large volume of unlabeled data, giving rise to the emerging encoder-as-a-service (EaaS) paradigm. A pre-trained encoder may be deemed confidential because its training often requires substantial data and computation resources, and because its public release may facilitate misuse of AI, e.g., for deepfake generation. In this paper, we propose the first attack, called StolenEncoder, to steal pre-trained image encoders. We evaluate StolenEncoder on multiple target encoders pre-trained by ourselves and on three real-world target encoders: the ImageNet encoder pre-trained by Google, the CLIP encoder pre-trained by OpenAI, and Clarifai's General Embedding encoder deployed as a paid EaaS. Our results show that the encoders stolen by StolenEncoder have similar functionality to the target encoders; in particular, downstream classifiers built upon a target encoder and its stolen counterpart achieve similar accuracy. Moreover, stealing a target encoder with StolenEncoder requires much less data and computation than pre-training it from scratch. We also explore three defenses that perturb the feature vectors produced by a target encoder. Our evaluation shows that these defenses are insufficient to mitigate StolenEncoder.
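To make the threat model concrete, the sketch below illustrates the general recipe behind encoder stealing: the attacker queries the target encoder (the EaaS API) for feature vectors on unlabeled images and trains a surrogate encoder to reproduce them. This is a minimal illustration, not the paper's exact training objective; the target encoder is simulated locally here, and names such as query_target_encoder and stealing_step are placeholders of our own invention.

```python
import torch
import torch.nn as nn
import torchvision

# Simulated "target" encoder standing in for the EaaS backend.
# (In a real attack this lives behind an API; a ResNet-18 is an assumption.)
target_backbone = torchvision.models.resnet18(weights=None)
target_backbone.fc = nn.Identity()   # expose the 512-d feature vector
target_backbone.eval()

def query_target_encoder(images: torch.Tensor) -> torch.Tensor:
    """Placeholder for the paid EaaS API that returns feature vectors."""
    with torch.no_grad():
        return target_backbone(images)

# Surrogate ("stolen") encoder the attacker trains from scratch.
stolen = torchvision.models.resnet18(weights=None)
stolen.fc = nn.Identity()

optimizer = torch.optim.Adam(stolen.parameters(), lr=1e-3)

def stealing_step(images: torch.Tensor) -> float:
    """One optimization step: pull the surrogate's features toward the
    feature vectors returned by the target encoder (MSE distance)."""
    targets = query_target_encoder(images)   # one API query per image
    preds = stolen(images)
    loss = nn.functional.mse_loss(preds, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage on a random batch (a real attack uses unlabeled images).
batch = torch.randn(8, 3, 224, 224)
print(stealing_step(batch))
```

The key economics claimed in the abstract follow from this structure: the attacker needs only query access and unlabeled images, not the target's training set or its pre-training compute.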
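The defenses evaluated in the paper perturb the feature vectors returned by the target encoder before the attacker sees them. As a hedged sketch of one such perturbation (additive noise; the paper's three defenses are not specified in the abstract and may differ), a service could post-process features like this:

```python
import torch

def perturb_features(features: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Illustrative defense: add Gaussian noise to the feature vectors
    returned by the EaaS API, trading utility for stealing resistance.
    sigma is a hypothetical noise scale, not a value from the paper."""
    return features + sigma * torch.randn_like(features)
```

Per the abstract's evaluation, perturbations of this kind are insufficient: the surrogate still converges to a functionally similar encoder.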

Related Research

BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning (08/01/2021)
Self-supervised learning in computer vision aims to pre-train an image e...

SSL-Auth: An Authentication Framework by Fragile Watermarking for Pre-trained Encoders in Self-supervised Learning (08/09/2023)
Self-supervised learning (SSL), utilizing unlabeled datasets for trainin...

10 Security and Privacy Problems in Self-Supervised Learning (10/28/2021)
Self-supervised learning has achieved revolutionary progress in the past...

One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks (10/12/2022)
Current multimodal models, aimed at solving Vision and Language (V+L) ta...

EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning (08/25/2021)
Given a set of unlabeled images or (image, text) pairs, contrastive lear...

ESTAS: Effective and Stable Trojan Attacks in Self-supervised Encoders with One Target Unlabelled Sample (11/20/2022)
Emerging self-supervised learning (SSL) has become a popular image repre...

Downstream-agnostic Adversarial Examples (07/23/2023)
Self-supervised learning usually uses a large amount of unlabeled data t...
