Data Isotopes for Data Provenance in DNNs

08/29/2022
by Emily Wenger, et al.

Today, creators of data-hungry deep neural networks (DNNs) scour the Internet for training fodder, leaving users with little control over or knowledge of when their data is appropriated for model training. To empower users to counteract unwanted data use, we design, implement, and evaluate a practical system that enables users to detect if their data was used to train a DNN model. We show how users can create special data points we call isotopes, which introduce "spurious features" into DNNs during training. With only query access to a trained model, no knowledge of the model training process, and no control over the data labels, a user can apply statistical hypothesis testing to detect if a model has learned the spurious features associated with their isotopes by training on the user's data. This effectively turns DNNs' vulnerability to memorization and spurious correlations into a tool for data provenance. Our results confirm efficacy in multiple settings, detecting and distinguishing between hundreds of isotopes with high accuracy. We further show that our system works on public ML-as-a-service platforms and larger models such as ImageNet, can use physical objects instead of digital marks, and remains generally robust against several adaptive countermeasures.


research
04/10/2021

Use of Metamorphic Relations as Knowledge Carriers to Train Deep Neural Networks

Training multiple-layered deep neural networks (DNNs) is difficult. The ...
research
09/22/2018

The Optimal ANN Model for Predicting Bearing Capacity of Shallow Foundations Trained on Scarce Data

This study is focused on determining the potential of using deep neural ...
research
09/16/2020

Analysis of Generalizability of Deep Neural Networks Based on the Complexity of Decision Boundary

For supervised learning models, the analysis of generalization ability (...
research
11/28/2018

Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects

Despite excellent performance on stationary test sets, deep neural netwo...
research
05/04/2023

CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning

The emergence of the Internet of Things (IoT) has resulted in a remarkab...
research
07/02/2020

A Novel DNN Training Framework via Data Sampling and Multi-Task Optimization

Conventional DNN training paradigms typically rely on one training set a...
research
01/07/2020

PaRoT: A Practical Framework for Robust Deep Neural Network Training

Deep Neural Networks (DNNs) are finding important applications in safety...
