Deep Latent Defence

10/09/2019
by Giulio Zizzo et al.

Deep learning methods have shown state-of-the-art performance in a range of tasks, from computer vision to natural language processing. However, it is well known that such systems are vulnerable to attackers who craft inputs in order to cause misclassification. The level of perturbation an attacker needs to introduce to cause such a misclassification can be extremely small, and often imperceptible. This is of significant security concern, particularly where misclassification can cause harm to humans. We thus propose Deep Latent Defence, an architecture which combines adversarial training with a detection system. At its core, Deep Latent Defence has an adversarially trained neural network. A series of encoders take the intermediate layer representations of data as it passes through the network and project them into a latent space, which we use for detecting adversarial samples via a k-NN classifier. We present results using both grey- and white-box attackers, as well as an adaptive L_∞-bounded attack constructed specifically to try to evade our defence. We find that even under the strongest attacker model we investigated, our defence offers significant defensive benefits.
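The abstract describes the pipeline only at a high level: encoders map intermediate activations to a latent space, and a k-NN classifier over those latent codes is used to flag adversarial inputs. Below is a minimal sketch of that idea, not the authors' implementation. The classifier architecture, the single-linear-layer encoders, the choice of k, and the disagreement-based decision rule (flag an input when any latent k-NN's label contradicts the network's prediction) are all assumptions for illustration; names such as SmallNet, LatentEncoder, fit_detectors, and detect are hypothetical.

```python
# Hedged sketch of a Deep-Latent-Defence-style detector: intermediate layer
# activations are projected to a latent space and checked with per-layer k-NNs.
# All architectural details and the decision rule are assumptions.
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

class SmallNet(nn.Module):
    """Stand-in for the adversarially trained classifier."""
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
        self.layer2 = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
        self.head = nn.Linear(64, 10)

    def forward(self, x):
        h1 = self.layer1(x)            # intermediate representation 1
        h2 = self.layer2(h1)           # intermediate representation 2
        return self.head(h2), [h1, h2]

class LatentEncoder(nn.Module):
    """Projects one intermediate representation into a small latent space."""
    def __init__(self, in_dim, latent_dim=16):
        super().__init__()
        self.net = nn.Linear(in_dim, latent_dim)

    def forward(self, h):
        return self.net(h)

def fit_detectors(model, encoders, x_train, y_train, k=5):
    """Fit one k-NN per layer on latent codes of clean training data."""
    model.eval()
    with torch.no_grad():
        _, hs = model(x_train)
        knns = []
        for enc, h in zip(encoders, hs):
            z = enc(h).numpy()
            knns.append(KNeighborsClassifier(n_neighbors=k).fit(z, y_train.numpy()))
    return knns

def detect(model, encoders, knns, x):
    """Flag x as adversarial if any latent k-NN disagrees with the network."""
    model.eval()
    with torch.no_grad():
        logits, hs = model(x)
        pred = logits.argmax(dim=1).numpy()
        votes = [knn.predict(enc(h).numpy())
                 for enc, h, knn in zip(encoders, hs, knns)]
    # Assumed decision rule: disagreement between the classifier's label and
    # any per-layer k-NN label raises an adversarial flag.
    return [any(v[i] != pred[i] for v in votes) for i in range(len(pred))]

if __name__ == "__main__":
    torch.manual_seed(0)
    model = SmallNet()
    encoders = [LatentEncoder(128), LatentEncoder(64)]
    x_train = torch.randn(256, 1, 28, 28)          # dummy clean data
    y_train = torch.randint(0, 10, (256,))
    knns = fit_detectors(model, encoders, x_train, y_train)
    print(detect(model, encoders, knns, torch.randn(4, 1, 28, 28)))
```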


