10 Security and Privacy Problems in Self-Supervised Learning

10/28/2021
by   Jinyuan Jia, et al.

Self-supervised learning has achieved revolutionary progress in the past several years and is commonly believed to be a promising approach toward general-purpose AI. In particular, self-supervised learning aims to pre-train an encoder using a large amount of unlabeled data. The pre-trained encoder is like an "operating system" of the AI ecosystem: it can be used as a feature extractor for many downstream tasks with little or no labeled training data. Existing studies on self-supervised learning have mainly focused on pre-training a better encoder to improve its performance on downstream tasks in non-adversarial settings, leaving its security and privacy in adversarial settings largely unexplored. A security or privacy vulnerability in a pre-trained encoder therefore becomes a single point of failure for the AI ecosystem. In this book chapter, we discuss 10 basic security and privacy problems for pre-trained encoders in self-supervised learning, including six confidentiality problems, three integrity problems, and one availability problem. For each problem, we discuss potential opportunities and challenges. We hope our book chapter will inspire future research on the security and privacy of self-supervised learning.
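To make the encoder-as-feature-extractor workflow described above concrete, the following is a minimal sketch of how a downstream task might reuse a pre-trained encoder. It is illustrative only: the `encoder` here is a hypothetical stand-in (a real self-supervised encoder such as one produced by SimCLR or MoCo would be loaded from its released checkpoint), and the data are random toy tensors standing in for a small labeled downstream dataset.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pre-trained encoder; in practice this would be
# loaded from a checkpoint produced by self-supervised pre-training on
# unlabeled data.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 256), nn.ReLU())

# Freeze the encoder: the downstream task only trains a small head on top.
for p in encoder.parameters():
    p.requires_grad = False

# Linear classifier for the downstream task, trained on little labeled data.
head = nn.Linear(256, 10)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Toy labeled batch standing in for the downstream training set.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))

for _ in range(100):
    with torch.no_grad():  # the encoder acts purely as a feature extractor
        features = encoder(images)
    loss = criterion(head(features), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because many downstream classifiers share this one frozen encoder, any flaw in the encoder (a backdoor, a stolen copy, or leaked pre-training data) propagates to all of them, which is exactly the single-point-of-failure concern the chapter examines.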


