Training Natural Language Processing Models on Encrypted Text for Enhanced Privacy

05/03/2023
by Davut Emre Tasar, et al.

With the increasing use of cloud-based services for training and deploying machine learning models, data privacy has become a major concern. This is particularly important for natural language processing (NLP) models, which often process sensitive information such as personal communications and confidential documents. In this study, we propose a method for training NLP models on encrypted text data, which mitigates data privacy concerns while maintaining performance comparable to that of models trained on non-encrypted data. We demonstrate our method using two architectures, Doc2Vec+XGBoost and Doc2Vec+LSTM, and evaluate the models on the 20 Newsgroups dataset. Our results indicate that models trained on encrypted data and models trained on non-encrypted data achieve comparable performance, suggesting that our encryption method preserves data privacy without sacrificing model accuracy. To allow replication of our experiments, we provide a Colab notebook at the following address: https://t.ly/lR-TP
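The abstract does not say which cipher is used, so the following is only a minimal sketch of the general idea, under stated assumptions. It replaces each token with a deterministic keyed digest (HMAC-SHA256, a keyed pseudonymization standing in for the encryption step, not the authors' method). The same word always maps to the same opaque string, so co-occurrence statistics survive for a downstream embedding model such as Doc2Vec. The function names, the key, and the sample sentences are hypothetical.

```python
import hashlib
import hmac


def encrypt_token(token: str, key: bytes, length: int = 16) -> str:
    """Map a token to a deterministic, key-dependent hex string.

    Assumption: HMAC-SHA256 is a stand-in for the paper's encryption step.
    It is one-way, so it hides the token but cannot be decrypted.
    """
    digest = hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def encrypt_document(text: str, key: bytes) -> list[str]:
    """Lowercase, split on whitespace, and transform every token."""
    return [encrypt_token(tok, key) for tok in text.lower().split()]


# Hypothetical key that stays on the data owner's side.
key = b"client-side-secret"

# Hypothetical sample documents; the paper uses the 20 Newsgroups dataset.
docs = ["the rocket reached orbit", "the goalie stopped the puck"]
encrypted_docs = [encrypt_document(d, key) for d in docs]

# Each document is now a list of opaque tokens that could be handed to an
# embedding model in place of the plaintext words.
for tokens in encrypted_docs:
    print(tokens)
```

Because the mapping is deterministic, repeated words remain recognizable as repeats, which is what lets a model learn from the transformed corpus. It also means word-frequency patterns remain visible to whoever holds the transformed data.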


