Sentence-level Privacy for Document Embeddings

05/10/2022
by Casey Meehan, et al.

User language data can contain highly sensitive personal content. As such, it is imperative to offer users a strong and interpretable privacy guarantee when learning from their data. In this work, we propose SentDP: pure local differential privacy at the sentence level for a single user document. We propose a novel technique, DeepCandidate, that combines concepts from robust statistics and language modeling to produce high-dimensional, general-purpose ϵ-SentDP document embeddings. This guarantees that any single sentence in a document can be substituted with any other sentence while keeping the embedding ϵ-indistinguishable. Our experiments indicate that these private document embeddings are useful for downstream tasks like sentiment analysis and topic classification and even outperform baseline methods with weaker guarantees like word-level Metric DP.
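The guarantee described above can be written out as a standard pure differential privacy inequality. The following is a sketch of that statement, not the paper's own formulation: the symbols $M$ (the randomized embedding mechanism), $k$ (the number of sentences), $d$ (the embedding dimension), and $S$ (a set of possible outputs) are notation assumed here for illustration and do not appear in the abstract.

```latex
% Assumed notation: a document is a sequence of k sentences,
%   x = (s_1, \dots, s_k),
% and M is a randomized mechanism mapping a document to an embedding in R^d.
%
% M satisfies \epsilon-SentDP if, for every pair of documents x and x' that
% differ by substituting a single sentence, and every measurable S \subseteq R^d:
\[
  \Pr\bigl[ M(x) \in S \bigr] \;\le\; e^{\epsilon} \, \Pr\bigl[ M(x') \in S \bigr].
\]
% The inequality holds in both directions because the roles of x and x'
% can be exchanged.
```

Under this reading, a smaller ϵ means the embedding reveals less about any individual sentence. By the usual group-privacy property of pure differential privacy, substituting m sentences would degrade the bound to mϵ.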

research
11/20/2022

Learning to Generate Image Embeddings with User-level Differential Privacy

Small on-device models have been successfully trained with user-level di...
research
10/19/2020

Locality Sensitive Hashing with Extended Differential Privacy

Extended differential privacy, a generalization of standard differential...
research
04/28/2023

Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?

Dense vector representations for textual data are crucial in modern NLP....
research
09/19/2023

A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings

We propose a Neighbourhood-Aware Differential Privacy (NADP) mechanism c...
research
06/20/2019

Hierarchical Document Encoder for Parallel Corpus Mining

We explore using multilingual document embeddings for nearest neighbor m...
research
06/04/2019

Towards Lossless Encoding of Sentences

A lot of work has been done in the field of image compression via machin...
research
09/12/2017

StarSpace: Embed All The Things!

We present StarSpace, a general-purpose neural embedding model that can ...
