Déjà Vu: an empirical evaluation of the memorization properties of ConvNets

09/17/2018
by   Alexandre Sablayrolles, et al.

Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and dropout are employed to mitigate overfitting. This paper considers the related question of "membership inference", where the goal is to determine whether an image was used during training. We consider it from three complementary angles. We show how to detect which dataset was used to train a model, and in particular whether some validation images were used at train time. We then analyze explicit memorization and extend classical random-label experiments to the problem of learning a model that predicts whether an image belongs to an arbitrary set. Finally, we propose a new approach to infer membership when a few of the top layers are not available or have been fine-tuned, and show that lower layers still carry information about the training samples. To support our findings, we conduct large-scale experiments on Imagenet and subsets of YFCC-100M with modern architectures such as VGG and Resnet.
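The core idea behind membership inference is that a model fits its training samples more tightly than unseen ones, so a sample's loss alone leaks membership. A minimal sketch of a loss-threshold attack, using synthetic losses rather than a real network (the paper's actual attacks are more involved; all names here are illustrative):

```python
import random
import statistics

def make_membership_test(member_losses, nonmember_losses):
    """Calibrate a simple loss-threshold membership test.

    A sample whose loss falls below the threshold (midpoint of the mean
    member and non-member losses) is predicted to be a training member.
    """
    threshold = (statistics.mean(member_losses) +
                 statistics.mean(nonmember_losses)) / 2
    return lambda loss: loss < threshold

# Synthetic losses: training members tend to be fit more tightly,
# so their losses cluster lower than those of held-out samples.
rng = random.Random(0)
members = [rng.gauss(0.2, 0.05) for _ in range(1000)]
nonmembers = [rng.gauss(0.8, 0.2) for _ in range(1000)]

predict_member = make_membership_test(members, nonmembers)
accuracy = (sum(predict_member(l) for l in members) +
            sum(not predict_member(l) for l in nonmembers)) / 2000
```

When the loss distributions of members and non-members separate well, as in this synthetic setup, even this one-parameter test attains high accuracy; regularizers such as data augmentation and dropout narrow the gap and make the attack harder.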

Related research

02/03/2020  Radioactive data: tracing through training
We want to detect whether a particular image dataset has been used to tr...

03/07/2023  Can Membership Inferencing be Refuted?
Membership inference (MI) attack is currently the most popular test for ...

09/17/2020  An Extension of Fano's Inequality for Characterizing Model Susceptibility to Membership Inference Attacks
Deep neural networks have been shown to be vulnerable to membership infe...

02/02/2022  Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference
A surprising phenomenon in modern machine learning is the ability of a h...

05/18/2022  Large Neural Networks Learning from Scratch with Very Few Data and without Regularization
Recent findings have shown that Neural Networks generalize also in over-...

11/01/2018  The Natural Auditor: How To Tell If Someone Used Your Words To Train Their Model
To help enforce data-protection regulations such as GDPR and detect unau...

06/09/2023  Understanding the Benefits of Image Augmentations
Image Augmentations are widely used to reduce overfitting in neural netw...
