Differentially Private n-gram Extraction

08/05/2021
by Kunho Kim, et al.

We revisit the problem of n-gram extraction in the differential privacy setting. In this problem, given a corpus of private text data, the goal is to release as many n-grams as possible while preserving user-level privacy. Extracting n-grams is a fundamental subroutine in many NLP applications, such as sentence completion and response generation for emails. The problem also arises in other applications such as sequence mining, and it generalizes the recently studied differentially private set union (DPSU) problem. In this paper, we develop a new differentially private algorithm for this problem which, in our experiments, significantly outperforms the state of the art. Our improvements stem from combining recent advances in DPSU, privacy accounting, and new heuristics for pruning in the tree-based approach initiated by Chen et al. (2012).
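To make the problem statement concrete, the sketch below shows a simple noisy-threshold baseline in the spirit of set-union style release: each user contributes a bounded number of distinct n-grams, Laplace noise is added to the resulting counts, and only n-grams whose noisy count clears a threshold are released. This is not the algorithm developed in the paper. The function names, the `max_per_user` cap, and the `noise_scale` and `threshold` parameters are illustrative assumptions, and the code makes no attempt to calibrate them to a particular (epsilon, delta) guarantee.

```python
import random
from collections import Counter


def extract_ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_threshold_release(user_texts, n, max_per_user,
                            noise_scale, threshold, seed=0):
    """Hypothetical baseline: bounded contributions + noise + threshold.

    ASSUMPTION: noise_scale and threshold are supplied by the caller and
    would have to be derived from the target privacy parameters; that
    derivation is deliberately left out of this sketch.
    """
    rng = random.Random(seed)
    counts = Counter()
    for text in user_texts:
        # Each user counts once per distinct n-gram ...
        grams = sorted(set(extract_ngrams(text.split(), n)))
        # ... and contributes at most max_per_user of them, chosen at random.
        rng.shuffle(grams)
        for gram in grams[:max_per_user]:
            counts[gram] += 1

    released = set()
    for gram in sorted(counts):
        # Release only n-grams whose noisy user count exceeds the threshold.
        if counts[gram] + laplace(noise_scale, rng) > threshold:
            released.add(gram)
    return released


if __name__ == "__main__":
    corpus = ["good morning team"] * 200 + ["my secret code"]
    print(noisy_threshold_release(corpus, n=2, max_per_user=5,
                                  noise_scale=1.0, threshold=20.0))
```

In this toy corpus, bigrams shared by many users are released while a bigram seen by a single user is almost certainly suppressed, which is the behavior user-level privacy is meant to enforce. The cap on per-user contributions is what bounds how much any one user can change the counts.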

research
06/13/2020

Auditing Differentially Private Machine Learning: How Private is Private SGD?

We investigate whether Differentially Private SGD offers better privacy ...
research
02/22/2020

Differentially Private Set Union

We study the basic operation of set union in the global model of differe...
research
10/20/2022

TraVaS: Differentially Private Trace Variant Selection for Process Mining

In the area of industrial process mining, privacy-preserving event data ...
research
03/18/2023

The Challenge of Differentially Private Screening Rules

Linear L_1-regularized models have remained one of the simplest and most...
research
07/21/2023

Differentially Private Heavy Hitter Detection using Federated Analytics

In this work, we study practical heuristics to improve the performance o...
research
06/05/2020

Differentially private partition selection

Many data analysis operations can be expressed as a GROUP BY query on an...
research
08/31/2019

Publishing Community-Preserving Attributed Social Graphs with a Differential Privacy Guarantee

We present a novel method for publishing differentially private syntheti...
