Provable Copyright Protection for Generative Models

02/21/2023
by   Nikhil Vyas, et al.

There is growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data C that was in their training set. We give a formal definition of near access-freeness (NAF) and prove bounds on the probability that a model satisfying this definition outputs a sample similar to C, even if C is included in its training set. Roughly speaking, a generative model p is k-NAF if, for every potentially copyrighted piece of data C, the output of p diverges by at most k bits from the output of a model q that did not access C at all. We also give generative model learning algorithms that efficiently modify the original generative model learning algorithm in a black-box manner and that output generative models with strong bounds on the probability of sampling protected content. Furthermore, we provide promising experiments for both language (transformer) and image (diffusion) generative models, showing minimal degradation in output quality while ensuring strong protections against sampling protected content.
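As a rough illustration of what "diverges by at most k bits" can mean, the sketch below assumes the divergence is the max-divergence, i.e. log2 of the worst-case ratio p(y)/q(y). The abstract does not fix the divergence, so this choice is an assumption, and the helper names max_kl_bits and event_bound, as well as the toy distributions, are hypothetical. Under that assumption, any event E satisfies p(E) <= 2^k * q(E), so an output that is unlikely under the model q that never saw C stays unlikely under p.

```python
import math


def max_kl_bits(p, q):
    """Max-divergence in bits: log2 of the largest ratio p(y)/q(y).

    p and q are dicts mapping outcomes to probabilities.
    Returns math.inf if p puts mass where q puts none.
    """
    worst = 0.0
    for y, py in p.items():
        if py == 0:
            continue  # outcomes p never produces cannot raise the ratio
        qy = q.get(y, 0.0)
        if qy == 0:
            return math.inf
        worst = max(worst, math.log2(py / qy))
    return worst


def event_bound(k_bits, q_event_prob):
    """Upper bound on p(E) implied by a k-bit max-divergence bound."""
    return min(1.0, (2 ** k_bits) * q_event_prob)


# Toy example (assumed numbers): p was trained with access to C, q was not.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
q = {"a": 0.25, "b": 0.5, "c": 0.25}

k = max_kl_bits(p, q)
# Treat outcome "a" as the event "output is similar to C".
bound_on_a = event_bound(k, q["a"])
```

In this toy case the bound is tight for outcome "a"; in general it only guarantees that p cannot amplify the probability of an event by more than a factor of 2^k relative to q.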

research
07/06/2021

Provable Lipschitz Certification for Generative Models

We present a scalable technique for upper bounding the Lipschitz constan...
research
05/25/2017

Latent Geometry and Memorization in Generative Models

It can be difficult to tell whether a trained generative model has learn...
research
05/24/2023

On the Generalization of Diffusion Model

The diffusion probabilistic generative models are widely used to generat...
research
11/29/2022

Taming a Generative Model

Generative models are becoming ever more powerful, being able to synthes...
research
02/25/2023

Data-Copying in Generative Models: A Formal Framework

There has been some recent interest in detecting and addressing memoriza...
research
11/07/2022

Proper losses for discrete generative models

We initiate the study of proper losses for evaluating generative models ...
research
06/02/2023

Sampling and Ranking for Digital Ink Generation on a tight computational budget

Digital ink (online handwriting) generation has a number of potential ap...
