Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe

10/25/2022
by Xiang Yue, et al.

Privacy concerns have attracted increasing attention in data-driven products and services. Existing legislation forbids arbitrary processing of personal data collected from individuals. Generating synthetic versions of such data with a formal privacy guarantee, such as differential privacy (DP), is considered one way to address these concerns. In this direction, we present a simple, practical, and effective recipe for the text domain: fine-tuning a generative language model with DP lets us generate useful synthetic text while mitigating privacy concerns. Through extensive empirical analyses, we demonstrate that our method produces synthetic data that is competitive in utility with its non-private counterpart while providing strong protection against potential privacy leakage.
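
The abstract leaves the training mechanics implicit. As a minimal sketch, assuming the DP fine-tuning is carried out with DP-SGD (per-example gradient clipping followed by Gaussian noise), which the abstract itself does not state, a single parameter update could look like the following. The function names, toy gradients, and hyperparameter values are hypothetical, and the sketch omits the privacy accounting needed to report an (epsilon, delta) guarantee.

```python
import math
import random


def clip(grad, max_norm):
    """Scale one per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One noisy update: clip each gradient, sum, add Gaussian noise, average."""
    batch_size = len(per_example_grads)
    clipped = [clip(g, max_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Noise standard deviation is tied to the clipping bound (assumed DP-SGD form).
    sigma = noise_multiplier * max_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]


# Toy usage: two parameters, two per-example gradients (hypothetical values).
rng = random.Random(0)
toy_grads = [[3.0, 4.0], [0.3, 0.4]]
updated = dp_sgd_step([0.0, 0.0], toy_grads, lr=0.1, max_norm=1.0,
                      noise_multiplier=1.0, rng=rng)
```

Clipping bounds how much any single training example can move the parameters, and the noise is scaled to that bound. In a real fine-tuning run the same step would be applied to the language model's per-example gradients, with a privacy accountant tracking the cumulative budget across steps.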


research
09/03/2022

Randomized Privacy Budget Differential Privacy

While pursuing better utility by discovering knowledge from the data, in...
research
11/28/2022

On the Utility Recovery Incapability of Neural Net-based Differential Private Tabular Training Data Synthesizer under Privacy Deregulation

Devising procedures for auditing generative model privacy-utility tradeo...
research
06/07/2023

Privately generating tabular data using language models

Privately generating synthetic data from a table is an important brick o...
research
06/02/2021

Differential Privacy for Text Analytics via Natural Text Sanitization

Texts convey sophisticated knowledge. However, texts also convey sensiti...
research
01/21/2019

Differential Privacy for Power Grid Obfuscation

The availability of high-fidelity energy networks brings significant val...
research
05/18/2022

GeoPointGAN: Synthetic Spatial Data with Local Label Differential Privacy

Synthetic data generation is a fundamental task for many data management...
research
05/10/2022

Mechanisms for Global Differential Privacy under Bayesian Data Synthesis

This paper introduces a new method that embeds any Bayesian model used t...
