Training Set Camouflage

12/13/2018
by Ayon Sen, et al.

We introduce a form of steganography in the domain of machine learning which we call training set camouflage. Imagine Alice has a training set on an illicit machine learning classification task. Alice wants Bob (a machine learning system) to learn the task. However, sending either the training set or the trained model to Bob can raise suspicion if the communication is monitored. Training set camouflage allows Alice to compute a second training set on a completely different, and seemingly benign, classification task. By construction, sending the second training set will not raise suspicion. When Bob applies his standard (public) learning algorithm to the second training set, he approximately recovers the classifier on the original task. We formulate training set camouflage as a combinatorial bilevel optimization problem and propose solvers based on nonlinear programming and local search. Experiments on real classification tasks demonstrate the feasibility of such camouflage.
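To make the bilevel structure concrete, the following is a minimal sketch of a local-search solver, not the authors' implementation. It rests on several assumptions that are not in the abstract: Alice picks the camouflage set as a size-m subset of a benign pool, Bob's public learner is a nearest-centroid classifier, the toy data are 2-D Gaussians, and the requirement that the set looks unsuspicious to a monitor is omitted. All function names are hypothetical. The inner problem is Bob's training step; the outer problem is Alice's search for the subset that minimizes error on her secret task.

```python
import numpy as np

def make_toy_data(seed=0):
    """Toy assumption: the benign task labels by x-coordinate, the secret task by y-coordinate."""
    rng = np.random.default_rng(seed)
    pool_X = rng.normal(size=(200, 2))
    pool_y = (pool_X[:, 0] > 0).astype(int)      # benign (cover) labels
    secret_X = rng.normal(size=(100, 2))
    secret_y = (secret_X[:, 1] > 0).astype(int)  # Alice's secret labels
    return pool_X, pool_y, secret_X, secret_y

def fit_centroids(X, y):
    """Hypothetical stand-in for Bob's public learner: one centroid per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    """Assign each point to the class of its nearest centroid."""
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def secret_error(idx, pool_X, pool_y, secret_X, secret_y):
    """Inner problem: train Bob on the subset, then score on Alice's secret task."""
    X, y = pool_X[idx], pool_y[idx]
    if len(np.unique(y)) < 2:
        return 1.0  # Bob cannot train on a single class; treat as worst case
    preds = predict(fit_centroids(X, y), secret_X)
    return float((preds != secret_y).mean())

def local_search(pool_X, pool_y, secret_X, secret_y, m=10, steps=300, seed=0):
    """Outer problem: swap one pool item at a time, keeping swaps that do not hurt."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pool_X), size=m, replace=False)
    best = secret_error(idx, pool_X, pool_y, secret_X, secret_y)
    history = [best]
    for _ in range(steps):
        cand = idx.copy()
        outside = np.setdiff1d(np.arange(len(pool_X)), idx)
        cand[rng.integers(m)] = rng.choice(outside)  # replace one chosen item
        err = secret_error(cand, pool_X, pool_y, secret_X, secret_y)
        if err <= best:
            idx, best = cand, err
        history.append(best)
    return idx, history

if __name__ == "__main__":
    data = make_toy_data()
    idx, history = local_search(*data)
    print("initial error:", history[0], "final error:", history[-1])
```

Because a candidate swap is accepted only when it does not increase the error, the recorded error never rises over the course of the search. A fuller version would also reject candidate subsets that fail whatever test the monitor applies.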

