Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples

12/31/2022
by   Jiaming Zhang, et al.

There is a growing interest in developing unlearnable examples (UEs) against visual privacy leaks on the Internet. UEs are training samples perturbed with invisible but unlearnable noise, which has been found to prevent unauthorized training of machine learning models. UEs are typically generated via a bilevel optimization framework with a surrogate model to remove (minimize) errors on the original samples, and are then applied to protect the data against unknown target models. However, existing UE generation methods all rely on an idealized assumption called label-consistency, where the hackers and protectors are assumed to hold the same label for a given sample. In this work, we propose and promote a more practical label-agnostic setting, where the hackers may exploit the protected data quite differently from the protectors. For example, an m-class unlearnable dataset held by the protector may be exploited by the hacker as an n-class dataset. Existing UE generation methods are rendered ineffective in this challenging setting. To tackle this challenge, we present a novel technique called Unlearnable Clusters (UCs), which generates label-agnostic unlearnable examples with cluster-wise perturbations. Furthermore, we propose to leverage Vision-and-Language Pre-trained Models (VLPMs) such as CLIP as the surrogate model to improve the transferability of the crafted UCs to diverse domains. We empirically verify the effectiveness of our approach under a variety of settings with different datasets and target models, and even on the commercial platforms Microsoft Azure and Baidu PaddlePaddle.
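The core idea, clustering the protected data in a surrogate feature space and assigning one shared perturbation per cluster, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration rather than the authors' released implementation: it assumes input images in [0, 1] already resized to 224x224, uses OpenAI's CLIP ViT-B/32 as the surrogate, clusters CLIP image features with k-means (no labels involved), and optimizes one bounded noise tensor per cluster so that perturbed samples collapse toward their cluster centroid. The function name make_unlearnable_clusters and all hyperparameters (k, eps, steps, lr) are placeholders, not values from the paper.

```python
# Hypothetical sketch of cluster-wise unlearnable perturbations (not the
# authors' released code). Assumes OpenAI CLIP is installed:
#   pip install git+https://github.com/openai/CLIP.git
import torch
import clip
from sklearn.cluster import KMeans

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model.eval()
for p in model.parameters():  # the surrogate is frozen; only the noise is trained
    p.requires_grad_(False)

# CLIP's own normalization constants, applied manually so the noise can be
# optimized and clamped in raw [0, 1] pixel space.
MEAN = torch.tensor((0.48145466, 0.4578275, 0.40821073), device=device).view(1, 3, 1, 1)
STD = torch.tensor((0.26862954, 0.26130258, 0.27577711), device=device).view(1, 3, 1, 1)

def encode(x):
    """L2-normalized CLIP image features for a batch of images in [0, 1]."""
    f = model.encode_image((x - MEAN) / STD).float()
    return f / f.norm(dim=-1, keepdim=True)

def make_unlearnable_clusters(images, k=10, eps=8 / 255, steps=50, lr=1e-2):
    """Return protected images carrying one shared perturbation per cluster."""
    images = images.to(device)
    with torch.no_grad():
        feats = encode(images)
    # Label-agnostic grouping: k-means on surrogate features, no labels used.
    ids = torch.as_tensor(
        KMeans(n_clusters=k, n_init=10).fit_predict(feats.cpu().numpy()),
        device=device,
    )
    centroids = torch.stack([feats[ids == c].mean(0) for c in range(k)])
    centroids = centroids / centroids.norm(dim=-1, keepdim=True)

    deltas = torch.zeros(k, *images.shape[1:], device=device, requires_grad=True)
    opt = torch.optim.Adam([deltas], lr=lr)
    for _ in range(steps):
        perturbed = (images + deltas[ids]).clamp(0, 1)
        f = encode(perturbed)
        # Pull every sample toward its own cluster centroid (cosine distance),
        # so samples within a cluster become nearly indistinguishable.
        loss = (1 - (f * centroids[ids]).sum(-1)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        deltas.data.clamp_(-eps, eps)  # keep the noise visually invisible
    return (images + deltas[ids]).clamp(0, 1).detach()
```

Because the clusters are derived from the surrogate's feature space rather than from any labeling, the same perturbed dataset stays protective regardless of how a hacker later partitions it into classes, which is the point of the label-agnostic setting.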
