Hindering Adversarial Attacks with Implicit Neural Representations

10/22/2022
by Andrei A. Rusu, et al.

We introduce the Lossy Implicit Network Activation Coding (LINAC) defence, an input transformation which successfully hinders several common adversarial attacks on CIFAR-10 classifiers for perturbations up to ϵ = 8/255 in L_∞ norm and ϵ = 0.5 in L_2 norm. Implicit neural representations are used to approximately encode pixel colour intensities in 2D images, such that classifiers trained on the transformed data appear robust to small perturbations without adversarial training or large drops in performance. The seed of the random number generator used to initialise and train the implicit neural representation turns out to be necessary information for stronger generic attacks, suggesting its role as a private key. We devise a Parametric Bypass Approximation (PBA) attack strategy for key-based defences, which successfully invalidates an existing method in this category. Interestingly, our LINAC defence also hinders some transfer and adaptive attacks, including our novel PBA strategy. Our results emphasise the importance of a broad range of customised attacks despite apparent robustness according to standard evaluations. LINAC source code and the parameters of the defended classifier evaluated throughout this submission are available at https://github.com/deepmind/linac
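
To make the two perturbation budgets quoted above concrete, here is a minimal sketch of how an attack's perturbation is typically constrained to an L_∞ ball of radius 8/255 or an L_2 ball of radius 0.5. It is not taken from the LINAC code base: the function names are hypothetical, images are represented as flat lists of floats, and pixel intensities are assumed to be scaled to [0, 1].

```python
import math

EPS_LINF = 8 / 255  # L_inf budget quoted in the abstract
EPS_L2 = 0.5        # L_2 budget quoted in the abstract


def project_linf(delta, eps=EPS_LINF):
    """Clip every component of the perturbation to [-eps, eps]."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps=EPS_L2):
    """Rescale the perturbation so its Euclidean norm is at most eps."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(delta)
    scale = eps / norm
    return [d * scale for d in delta]


def apply_perturbation(image, delta):
    """Add a perturbation and keep pixels in the assumed [0, 1] range."""
    return [max(0.0, min(1.0, x + d)) for x, d in zip(image, delta)]
```

Iterative attacks usually apply one of these projections after every gradient step, so the final adversarial image never leaves the stated budget around the clean input.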

research
03/15/2020

Output Diversified Initialization for Adversarial Attacks

Adversarial examples are often constructed by iteratively refining a ran...
research
01/04/2023

Beckman Defense

Optimal transport (OT) based distributional robust optimisation (DRO) ha...
research
09/07/2023

DiffDefense: Defending against Adversarial Attacks via Diffusion Models

This paper presents a novel reconstruction method that leverages Diffusi...
research
10/17/2020

A Stochastic Neural Network for Attack-Agnostic Adversarial Robustness

Stochastic Neural Networks (SNNs) that inject noise into their hidden la...
research
03/31/2022

Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond

Rain removal aims to remove rain streaks from images/videos and reduce t...
research
07/12/2020

Probabilistic Jacobian-based Saliency Maps Attacks

Machine learning models have achieved spectacular performances in variou...
research
05/24/2023

Relating Implicit Bias and Adversarial Attacks through Intrinsic Dimension

Despite their impressive performance in classification, neural networks ...
