On the Intriguing Connections of Regularization, Input Gradients and Transferability of Evasion and Poisoning Attacks

09/08/2018
by Ambra Demontis, et al.

Transferability captures the ability of an attack against a machine-learning model to be effective against a different, potentially unknown, model. Studying the transferability of attacks has gained interest in recent years due to the deployment of cyber-attack detection services based on machine learning. For these applications of machine learning, service providers avoid disclosing information about their machine-learning algorithms. As a result, attackers trying to bypass detection are forced to craft their attacks against a surrogate model instead of the actual target model used by the service. While previous work has shown that finding test-time transferable attack samples is possible, it is not well understood how an attacker may construct adversarial examples that are likely to transfer across different models, particularly in the case of training-time poisoning attacks. In this paper, we present the first empirical analysis aimed at investigating the transferability of both test-time evasion and training-time poisoning attacks. We provide a unifying, formal definition of the transferability of such attacks and show how it relates to the input gradients of the surrogate and target classification models. We assess to what extent some of the most well-known machine-learning systems are vulnerable to transfer attacks, and explain why such attacks succeed (or not) across different models. To this end, we leverage some interesting connections highlighted in this work among the adversarial vulnerability of machine-learning models, their regularization hyperparameters, and their input gradients.
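
As a concrete illustration of the surrogate-based threat model described in the abstract, the sketch below is a minimal, hypothetical example rather than the paper's experimental setup: it trains a surrogate logistic-regression classifier, crafts evasion samples with a single FGSM-style step along the sign of the surrogate's input gradient, and then checks whether those samples also fool a separately trained linear SVM standing in for the unknown target. The scikit-learn dependency, the synthetic data, the choice of target model, and the perturbation budget eps are all illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's experiments): evasion attack
# crafted on a surrogate model, then evaluated for transfer to a hidden target.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Synthetic binary classification data split into surrogate-training,
# target-training and test portions.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_sur, X_tgt, X_test = X[:800], X[800:1600], X[1600:]
y_sur, y_tgt, y_test = y[:800], y[800:1600], y[1600:]

# Surrogate trained by the attacker; C is the (inverse) regularization strength.
surrogate = LogisticRegression(C=1.0, max_iter=1000).fit(X_sur, y_sur)
# Target model the attacker cannot see (hypothetical stand-in).
target = SVC(kernel="linear", C=1.0).fit(X_tgt, y_tgt)

# For a linear surrogate, the input gradient of the logistic loss is aligned
# with the weight vector, with sign depending on the true label.
w = surrogate.coef_.ravel()
eps = 0.5  # perturbation budget (illustrative value)

# One FGSM-style step per test point along the sign of the surrogate's
# loss gradient, pushing each sample away from its true class.
y_pm = 2 * y_test - 1  # map labels {0, 1} -> {-1, +1}
X_adv = X_test - eps * np.sign(np.outer(y_pm, w))

print("surrogate acc (clean):", surrogate.score(X_test, y_test))
print("surrogate acc (adv):  ", surrogate.score(X_adv, y_test))
print("target acc (clean):   ", target.score(X_test, y_test))
print("target acc (adv):     ", target.score(X_adv, y_test))
```

In this toy setting, varying the regularization strength C of the surrogate and of the target is a simple way to probe the kind of connection between regularization, input gradients and transfer success that the paper investigates.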

Related research

12/03/2021
Attack-Centric Approach for Evaluating Transferability of Adversarial Samples in Machine Learning Models
Transferability of adversarial samples became a serious concern due to t...

06/14/2021
Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions
Backdoor attacks inject poisoning samples during training, with the goal...

09/17/2021
Messing Up 3D Virtual Environments: Transferable Adversarial 3D Objects
In the last few years, the scientific community showed a remarkable and ...

06/27/2023
Your Attack Is Too DUMB: Formalizing Attacker Scenarios for Adversarial Transferability
Evasion attacks are a threat to machine learning models, where adversari...

03/19/2018
When Does Machine Learning FAIL? Generalized Transferability for Evasion and Poisoning Attacks
Attacks against machine learning systems represent a growing threat as h...

11/22/2021
NTD: Non-Transferability Enabled Backdoor Detection
A backdoor deep learning (DL) model behaves normally upon clean inputs b...

03/14/2022
Energy-Latency Attacks via Sponge Poisoning
Sponge examples are test-time inputs carefully-optimized to increase ene...
