Privacy-Preserving Public Release of Datasets for Support Vector Machine Classification

12/29/2019
by   Farhad Farokhi, et al.

We consider the problem of publicly releasing a dataset for support vector machine classification without infringing on the privacy of data subjects (i.e., individuals whose private information is stored in the dataset). The dataset is systematically obfuscated using additive noise for privacy protection. Motivated by the Cramér-Rao bound, the inverse of the trace of the Fisher information matrix is used as a measure of privacy. Conditions are established for ensuring that the classifiers extracted from the original dataset and the obfuscated one are close to each other (capturing the utility). The optimal noise distribution is determined by maximizing a weighted sum of the measures of privacy and utility. The optimal privacy-preserving noise is proved to achieve local differential privacy. The results are generalized to a broader class of optimization-based supervised machine learning algorithms. The applicability of the methodology is demonstrated on multiple datasets.
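The privacy measure named in the abstract can be illustrated with a minimal sketch. It assumes zero-mean Gaussian additive noise purely for illustration; the abstract does not state which distribution is optimal, so this is not the paper's noise design. For Gaussian noise with covariance Σ, the Fisher information about the underlying record is Σ⁻¹, so the measure reduces to 1 / trace(Σ⁻¹). The function names `obfuscate` and `privacy_measure`, the synthetic data, and the chosen covariance are all hypothetical.

```python
import numpy as np


def obfuscate(X, noise_cov, rng):
    """Add zero-mean Gaussian noise with covariance `noise_cov` to every row of X.

    Gaussian noise is an assumption made for this sketch only.
    """
    n_samples, n_features = X.shape
    noise = rng.multivariate_normal(np.zeros(n_features), noise_cov, size=n_samples)
    return X + noise


def privacy_measure(noise_cov):
    """Inverse of the trace of the Fisher information matrix.

    For additive Gaussian noise, the Fisher information about the
    original record equals the inverse of the noise covariance.
    """
    fisher_information = np.linalg.inv(noise_cov)
    return 1.0 / np.trace(fisher_information)


# Synthetic two-feature dataset (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

noise_cov = 0.25 * np.eye(2)
X_noisy = obfuscate(X, noise_cov, rng)

# Larger noise covariance gives a larger privacy measure.
print(privacy_measure(noise_cov))
print(privacy_measure(2.0 * noise_cov))
```

Under this assumption, scaling the noise covariance by a factor scales the privacy measure by the same factor, which is the tension with utility that the weighted-sum objective in the abstract is meant to balance.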


