Privacy-Adaptive BERT for Natural Language Understanding

04/15/2021
by Chen Qu, et al.

When applying recent advances in Natural Language Understanding (NLU) to real-world applications, privacy preservation poses a crucial challenge that, unfortunately, has not been well resolved. To address this issue, we study how to improve the effectiveness of NLU models under a Local Privacy setting, using BERT, a widely used pretrained Language Model (LM), as an example. We systematically study the strengths and weaknesses of imposing dχ-privacy, a relaxed variant of Local Differential Privacy, at different stages of language modeling: input text, token embeddings, and sequence representations. We then focus on the first two with privacy-constrained fine-tuning experiments to reveal the utility of BERT under local privacy constraints. More importantly, to the best of our knowledge, we are the first to propose privacy-adaptive LM pretraining methods and demonstrate that they can significantly improve model performance on privatized text input. We also interpret the level of privacy preservation and provide guidance on privacy parameter selection.
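
The abstract does not spell out the privatization mechanism, so the following is only an illustrative sketch of one common way dχ-privacy (rendered as dx-privacy in some copies) is realized for embeddings in the literature: add noise whose density is proportional to exp(-η·‖z‖), which can be drawn as a uniformly random direction scaled by a Gamma(n, 1/η) magnitude. For text-level privatization, the noisy vector is then mapped back to its nearest vocabulary embedding. The function names, the symbol `eta` for the privacy parameter, and the toy vocabulary are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def sample_dx_noise(dim, eta, rng):
    """Draw a noise vector with density proportional to exp(-eta * ||z||).

    Assumed construction: uniform direction on the unit sphere times a
    Gamma(shape=dim, scale=1/eta) magnitude.
    """
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in direction))
    magnitude = rng.gammavariate(dim, 1.0 / eta)
    return [magnitude * v / norm for v in direction]


def privatize_embedding(embedding, eta, rng):
    """Token-embedding privatization: return the embedding plus noise."""
    noise = sample_dx_noise(len(embedding), eta, rng)
    return [x + z for x, z in zip(embedding, noise)]


def privatize_token(token_id, vocab_embeddings, eta, rng):
    """Text privatization: perturb the embedding, then return the id of
    the nearest vocabulary embedding (Euclidean distance)."""
    noisy = privatize_embedding(vocab_embeddings[token_id], eta, rng)
    return min(
        range(len(vocab_embeddings)),
        key=lambda i: math.dist(vocab_embeddings[i], noisy),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy vocabulary: 50 tokens with 8-dimensional embeddings.
    vocab = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
    for eta in (0.5, 5.0, 50.0):
        print(eta, privatize_token(3, vocab, eta, rng))
```

In this sketch a smaller `eta` means more noise and therefore stronger privacy, so the token is replaced by another one more often; a very large `eta` leaves the token essentially unchanged.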


