Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning

08/31/2022
by   Zhiyuan He, et al.
0

Deep Neural Networks (DNNs) have achieved excellent performance in various fields. However, DNNs' vulnerability to Adversarial Examples (AE) hinders their deployments to safety-critical applications. This paper presents a novel AE detection framework, named BEYOND, for trustworthy predictions. BEYOND performs the detection by distinguishing the AE's abnormal relation with its augmented versions, i.e. neighbors, from two prospects: representation similarity and label consistency. An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label for its highly informative representation capacity compared to supervised learning models. For clean samples, their representations and predictions are closely consistent with their neighbors, whereas those of AEs differ greatly. Furthermore, we explain this observation and show that by leveraging this discrepancy BEYOND can effectively detect AEs. We develop a rigorous justification for the effectiveness of BEYOND. Furthermore, as a plug-and-play model, BEYOND can easily cooperate with the Adversarial Trained Classifier (ATC), achieving the state-of-the-art (SOTA) robustness accuracy. Experimental results show that BEYOND outperforms baselines by a large margin, especially under adaptive attacks. Empowered by the robust relation net built on SSL, we found that BEYOND outperforms baselines in terms of both detection ability and speed. Our code will be publicly available.

READ FULL TEXT

page 2

page 7

research
01/23/2021

Online Adversarial Purification based on Self-Supervision

Deep neural networks are known to be vulnerable to adversarial examples,...
research
02/13/2022

Adversarial Fine-tuning for Backdoor Defense: Connect Adversarial Examples to Triggered Samples

Deep neural networks (DNNs) are known to be vulnerable to backdoor attac...
research
05/15/2019

War: Detecting adversarial examples by pre-processing input data

Deep neural networks (DNNs) have demonstrated their outstanding performa...
research
02/14/2021

Adversarial defense for automatic speaker verification by cascaded self-supervised learning models

Automatic speaker verification (ASV) is one of the core technologies in ...
research
08/19/2022

Self-Supervised Visual Place Recognition by Mining Temporal and Feature Neighborhoods

Visual place recognition (VPR) using deep networks has achieved state-of...
research
06/04/2022

MSR: Making Self-supervised learning Robust to Aggressive Augmentations

Most recent self-supervised learning methods learn visual representation...
research
11/29/2022

Birds of a Feather Trust Together: Knowing When to Trust a Classifier via Adaptive Neighborhood Aggregation

How do we know when the predictions made by a classifier can be trusted?...

Please sign up or login with your details

Forgot password? Click here to reset