Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence

12/31/2022
by Han Fang, et al.

Deep neural networks are vulnerable to adversarial attacks. In this paper, we take the role of investigators who want to trace the attack and identify the source, that is, the particular model from which the adversarial examples were generated. Techniques derived from this work would aid forensic investigation of attack incidents and serve as a deterrent to potential attacks. We consider the buyers-seller setting, where a machine learning model is distributed to various buyers and each buyer receives a slightly different copy with the same functionality. A malicious buyer generates adversarial examples from a particular copy ℳ_i and uses them to attack other copies. From these adversarial examples, the investigator wants to identify the source ℳ_i. To address this problem, we propose a two-stage separate-and-trace framework. The model separation stage generates multiple copies of a model for the same classification task. This process injects unique characteristics into each copy so that the adversarial examples generated from it carry distinct and traceable features. We achieve this with a parallel structure that embeds a “tracer” in each copy, together with a noise-sensitive training loss. The tracing stage takes adversarial examples and a few candidate models as input and identifies the likely source. Based on the unique features induced by the noise-sensitive loss function, we can effectively trace the adversarial copy by examining the output logits of each tracer. Empirical results show that it is possible to trace the origin of adversarial examples and that the mechanism applies to a wide range of architectures and datasets.
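
The abstract describes the separate-and-trace framework only at a high level. The sketch below is a minimal PyTorch illustration of the general idea, not the authors' implementation: the class and function names (TracedCopy, noise_sensitive_loss, trace_source), the concrete form of the noise-sensitive term, and the logit-based scoring rule are all assumptions made for illustration.

```python
# Hypothetical sketch of the separate-and-trace idea (PyTorch).
# Not the paper's released code; architecture, loss, and scoring rule are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TracedCopy(nn.Module):
    """One distributed copy: a shared backbone plus a copy-specific 'tracer'
    head running in parallel with the ordinary classification head."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                      # shared feature extractor
        self.head = nn.Linear(feat_dim, num_classes)  # standard classifier head
        self.tracer = nn.Sequential(                  # copy-specific parallel branch
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, num_classes))

    def forward(self, x):
        f = self.backbone(x)                          # assumes (N, feat_dim) features
        return self.head(f) + self.tracer(f)          # fuse tracer logits in parallel

def noise_sensitive_loss(model, x, y, sigma=0.03, alpha=1.0):
    """Cross-entropy on clean inputs plus a term rewarding a tracer branch
    that reacts strongly to small input perturbations (assumed form)."""
    ce = F.cross_entropy(model(x), y)
    x_noisy = x + sigma * torch.randn_like(x)
    f_clean, f_noisy = model.backbone(x), model.backbone(x_noisy)
    sensitivity = (model.tracer(f_noisy) - model.tracer(f_clean)).abs().mean()
    return ce - alpha * sensitivity                   # keep accuracy, maximise sensitivity

@torch.no_grad()
def trace_source(adv_x, candidates):
    """Return the index of the candidate copy whose tracer responds most
    strongly to the adversarial examples (simplified scoring rule)."""
    scores = [m.tracer(m.backbone(adv_x)).abs().mean().item() for m in candidates]
    return max(range(len(candidates)), key=lambda i: scores[i])
```

Under this reading, the tracer perturbs each copy's decision surface in a copy-specific way, so perturbations optimised against copy ℳ_i leave a signature that ℳ_i's own tracer amplifies more than the tracers of the other copies.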

research 03/13/2021
Generating Unrestricted Adversarial Examples via Three Parameters
Deep neural networks have been shown to be vulnerable to adversarial exa...

research 06/09/2021
Towards Defending against Adversarial Examples via Attack-Invariant Features
Deep neural networks (DNNs) are vulnerable to adversarial noise. Their a...

research 11/30/2021
Mitigating Adversarial Attacks by Distributing Different Copies to Different Users
Machine learning models are vulnerable to adversarial attacks. In this p...

research 06/02/2023
Adaptive Attractors: A Defense Strategy against ML Adversarial Collusion Attacks
In the seller-buyer setting on machine learning models, the seller gener...

research 07/01/2023
Common Knowledge Learning for Generating Transferable Adversarial Examples
This paper focuses on an important type of black-box attacks, i.e., tran...

research 04/08/2018
Adaptive Spatial Steganography Based on Probability-Controlled Adversarial Examples
Deep learning model is vulnerable to adversarial attack, which generates...

research 09/13/2019
Defending Against Adversarial Attacks by Suppressing the Largest Eigenvalue of Fisher Information Matrix
We propose a scheme for defending against adversarial attacks by suppres...
