ER-AE: Differentially-private Text Generation for Authorship Anonymization

07/20/2019
by   Haohan Bo, et al.

Most privacy protection studies for textual data focus on removing explicit sensitive identifiers. However, personal writing style, a strong indicator of authorship, is often neglected. Recent studies on writing-style anonymization can only output numeric vectors, which are difficult for recipients to interpret. We propose a novel text generation model for authorship anonymization. Combined with a semantic embedding reward loss function and the exponential mechanism, our auto-encoder generates differentially-private sentences that are semantically close and grammatically similar to the original text while removing personal traits of the writing style. It requires neither conditioned labels nor parallel text data during training. We evaluate the proposed model on a real-life peer review dataset and the Yelp review dataset. The results suggest that our model outperforms the state of the art in semantic preservation, authorship obfuscation, and stylometric transformation.
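To give a rough sense of the differentially-private sampling step such a model relies on, the sketch below shows a generic exponential-mechanism word selection. The function name, candidate list, utility scores, and epsilon value are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
# Hypothetical sketch of exponential-mechanism word sampling for
# differentially-private text generation (illustrative, not the paper's code).
# Assumption: `scores` are bounded utility scores (e.g., embedding similarities
# or decoder logits) with sensitivity at most `sensitivity`.
import math
import random

def exponential_mechanism(candidates, scores, epsilon, sensitivity=1.0):
    """Sample one candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    # Subtract the max score before exponentiating for numerical stability;
    # this rescales all weights by a constant and leaves probabilities unchanged.
    max_score = max(scores)
    weights = [math.exp(epsilon * (s - max_score) / (2.0 * sensitivity))
               for s in scores]
    total = sum(weights)
    r = random.uniform(0.0, total)
    acc = 0.0
    for cand, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return cand
    return candidates[-1]

# Example: replace a word by sampling from semantically similar candidates.
candidates = ["excellent", "great", "good", "fine"]
scores = [0.9, 0.8, 0.6, 0.4]  # e.g., similarity to the original word
print(exponential_mechanism(candidates, scores, epsilon=2.0))
```

Higher epsilon concentrates the sampling on the best-scoring candidate (less privacy, more utility); lower epsilon flattens the distribution toward uniform sampling.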


Related research

research · 01/29/2021 · ADePT: Auto-encoder based Differentially Private Text Transformation
Privacy is an important concern when building statistical models on data...

research · 02/16/2021 · Differentially Private Quantiles
Quantiles are often used for summarizing and understanding data. If that...

research · 05/23/2017 · Fast and Differentially Private Algorithms for Decentralized Collaborative Machine Learning
Consider a set of agents in a peer-to-peer communication network, where ...

research · 10/20/2022 · TraVaS: Differentially Private Trace Variant Selection for Process Mining
In the area of industrial process mining, privacy-preserving event data ...

research · 04/23/2021 · On a Utilitarian Approach to Privacy Preserving Text Generation
Differentially-private mechanisms for text generation typically add care...

research · 11/03/2022 · Time-aware Prompting for Text Generation
In this paper, we study the effects of incorporating timestamps, such as...

research · 07/23/2019 · Learning to Select, Track, and Generate for Data-to-Text
We propose a data-to-text generation model with two modules, one for tra...
