Visualizing Representations of Adversarially Perturbed Inputs

05/28/2021
by Daniel Steinberg, et al.

It has been shown that deep learning models are vulnerable to adversarial attacks. We seek to further understand the consequences of such attacks on the intermediate activations of neural networks. We present an evaluation metric, POP-N, which scores the effectiveness of projecting data to N dimensions in the context of visualizing representations of adversarially perturbed inputs. We conduct experiments on CIFAR-10 to compare the POP-2 scores of several dimensionality reduction algorithms across various adversarial attacks. Finally, we use the 2D data corresponding to high POP-2 scores to generate example visualizations.
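The sketch below is a minimal, illustrative version of the general workflow the abstract describes (not the authors' code): perturb CIFAR-10-sized inputs with a standard attack, collect an intermediate activation, and project clean versus perturbed activations to 2D for visualization. The toy model, the FGSM attack, the layer choice, and PCA as the reduction method are all assumptions for illustration; the paper's POP-N score itself is not reproduced here.

    # Illustrative sketch: visualize intermediate activations of clean vs.
    # adversarially perturbed inputs after projecting them to 2D.
    # Assumptions (not from the paper): a toy CNN, FGSM, PCA, random data.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from sklearn.decomposition import PCA
    import matplotlib.pyplot as plt

    class SmallCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(3, 16, 3, padding=1)
            self.fc = nn.Linear(16 * 32 * 32, 10)

        def features(self, x):
            # Intermediate activation whose representation we visualize.
            return F.relu(self.conv(x)).flatten(1)

        def forward(self, x):
            return self.fc(self.features(x))

    def fgsm(model, x, y, eps=8 / 255):
        # One-step FGSM perturbation (one common white-box attack).
        x = x.clone().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()

    model = SmallCNN().eval()
    x = torch.rand(64, 3, 32, 32)          # stand-in for a CIFAR-10 batch
    y = torch.randint(0, 10, (64,))
    x_adv = fgsm(model, x, y)

    with torch.no_grad():
        feats = torch.cat([model.features(x), model.features(x_adv)]).numpy()

    # Any dimensionality reduction algorithm could be slotted in here and
    # scored; PCA is used only as a placeholder.
    proj = PCA(n_components=2).fit_transform(feats)
    plt.scatter(*proj[:64].T, label="clean", alpha=0.6)
    plt.scatter(*proj[64:].T, label="perturbed", alpha=0.6)
    plt.legend()
    plt.show()

In this setup, comparing how well the 2D projection separates or preserves the clean and perturbed representations is the kind of question a projection-quality score such as POP-2 is meant to answer.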

Related research

- Deviations in Representations Induced by Adversarial Attacks (11/07/2022): Deep learning has been a popular topic and has achieved success in many ...
- Defending Adversarial Attacks by Correcting logits (06/26/2019): Generating and eliminating adversarial examples has been an intriguing t...
- Graph-based methods coupled with specific distributional distances for adversarial attack detection (05/31/2023): Artificial neural networks are prone to being fooled by carefully pertur...
- The Dilemma Between Dimensionality Reduction and Adversarial Robustness (06/18/2020): Recent work has shown the tremendous vulnerability to adversarial sample...
- Adversarial Attacks On Multi-Agent Communication (01/17/2021): Growing at a very fast pace, modern autonomous systems will soon be depl...
- Intriguing Parameters of Structural Causal Models (05/26/2021): In recent years there has been a lot of focus on adversarial attacks, es...
