Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning

04/26/2023
by Casey Meehan, et al.

Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models, which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models, as well as suggests potential practical mitigation strategies. Code is available at https://github.com/facebookresearch/DejaVu.
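The label-inference step described above can be illustrated with a minimal, hypothetical sketch: embed a background-only crop from a training image with the SSL model, then infer the foreground class via k-nearest neighbors in a labeled public embedding set. Everything below is synthetic for illustration; `knn_label_inference`, the cluster construction, and `k=5` are assumptions for this sketch, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_label_inference(crop_emb, public_embs, public_labels, k=5):
    """Return the majority label among the k nearest public embeddings.

    Stands in for the deja vu test: if the SSL model has memorized the
    training image, the embedding of a background-only crop lands close
    to public embeddings of the image's foreground class.
    """
    # Cosine similarity between the crop embedding and all public embeddings.
    crop = crop_emb / np.linalg.norm(crop_emb)
    pub = public_embs / np.linalg.norm(public_embs, axis=1, keepdims=True)
    sims = pub @ crop
    nearest = np.argsort(sims)[-k:]  # indices of the top-k neighbors
    labels, counts = np.unique(public_labels[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Synthetic stand-in for SSL embeddings: two classes clustered around two centers.
centers = {0: rng.normal(0, 1, 128), 1: rng.normal(0, 1, 128)}
public_labels = np.array([0] * 50 + [1] * 50)
public_embs = np.stack([centers[l] + 0.1 * rng.normal(size=128) for l in public_labels])

# A "background crop" embedding that falls near class 1's cluster, mimicking
# a memorizing model whose crop embedding encodes the foreground object.
crop_emb = centers[1] + 0.1 * rng.normal(size=128)
print(knn_label_inference(crop_emb, public_embs, public_labels))  # prints 1
```

In the paper's setting the embeddings would come from the trained SSL backbone and the labeled set from a disjoint public dataset; the sketch only shows the nearest-neighbor voting logic.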


