Fast Privacy-Preserving Text Classification based on Secure Multiparty Computation

01/18/2021
by Amanda Resende, et al.

We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, one party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice learns only the result of the classifier applied to her text input, and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. The solution is generic and can be used in any scenario in which a Naive Bayes classifier can be employed. Applied to spam detection, it classifies an SMS as spam or ham in less than 340 ms when the dictionary of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. With n = 369 and m = 8 (the average for a spam SMS in the database), our solution takes only 21 ms.
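To make the target functionality concrete, the following is a minimal plaintext sketch of the Naive Bayes decision that a protocol like the one described would evaluate privately. It is not the authors' implementation and provides no privacy on its own: Bob's model and Alice's unigrams are both visible in the clear here. The names `NaiveBayesModel`, `Label`, and `toy_model`, as well as every probability value, are assumptions made up for illustration and do not come from the paper.

```rust
use std::collections::HashMap;

/// Hypothetical plaintext model (Bob's input): class log-priors and,
/// for each dictionary word, a pair of log-likelihoods.
pub struct NaiveBayesModel {
    pub log_prior_spam: f64,
    pub log_prior_ham: f64,
    /// word -> (log P(word | spam), log P(word | ham))
    pub log_likelihoods: HashMap<String, (f64, f64)>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Label {
    Spam,
    Ham,
}

impl NaiveBayesModel {
    /// Sums log-probabilities over the message's unigrams (Alice's input)
    /// and returns the class with the larger score. Words that are not in
    /// the dictionary are skipped; ties are resolved as ham.
    pub fn classify(&self, unigrams: &[&str]) -> Label {
        let mut spam_score = self.log_prior_spam;
        let mut ham_score = self.log_prior_ham;
        for word in unigrams {
            if let Some(&(log_spam, log_ham)) = self.log_likelihoods.get(*word) {
                spam_score += log_spam;
                ham_score += log_ham;
            }
        }
        if spam_score > ham_score {
            Label::Spam
        } else {
            Label::Ham
        }
    }
}

/// Toy model with invented probabilities (illustration only).
pub fn toy_model() -> NaiveBayesModel {
    let mut log_likelihoods = HashMap::new();
    log_likelihoods.insert("free".to_string(), (0.05f64.ln(), 0.005f64.ln()));
    log_likelihoods.insert("prize".to_string(), (0.03f64.ln(), 0.001f64.ln()));
    log_likelihoods.insert("meeting".to_string(), (0.001f64.ln(), 0.02f64.ln()));
    NaiveBayesModel {
        log_prior_spam: 0.2f64.ln(),
        log_prior_ham: 0.8f64.ln(),
        log_likelihoods,
    }
}
```

With this toy model, `toy_model().classify(&["free", "prize"])` yields `Label::Spam`, while a message such as `["meeting", "tomorrow"]` yields `Label::Ham` (the word "tomorrow" is outside the toy dictionary and is ignored). In the setting described above, the same comparison would be carried out under SMC so that only the final label reaches Alice.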
