Metrics and methods for robustness evaluation of neural networks with generative models

03/04/2020
by Igor Buzhinsky, et al.

Recent studies have shown that modern deep neural network classifiers are easy to fool, provided that an adversary is able to slightly modify their inputs. Many papers have proposed adversarial attacks, defenses, and methods to measure robustness to such adversarial perturbations. However, most commonly considered adversarial examples are based on ℓ_p-bounded perturbations in the input space of the neural network, which are unlikely to arise naturally. Recently, especially in computer vision, researchers have discovered "natural" or "semantic" perturbations, such as rotations, changes of brightness, or higher-level changes, but these perturbations have not yet been systematically used to measure classifier performance. In this paper, we propose several metrics to measure the robustness of classifiers to natural adversarial examples, together with methods to evaluate them. These metrics, called latent space performance metrics, rely on the ability of generative models to capture probability distributions and are defined in the latent spaces of those models. On three image classification case studies, we evaluate the proposed metrics for several classifiers, including ones trained in conventional and robust ways. We find that the latent counterparts of adversarial robustness are associated with the accuracy of the classifier rather than with its conventional adversarial robustness, although the latter is still reflected in the properties of the found latent perturbations. In addition, our novel method of finding latent adversarial perturbations demonstrates that these perturbations are often perceptually small.
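The paper defines its latent space metrics and search procedure formally; as a rough illustration of the general idea of searching for latent adversarial perturbations (not the authors' exact method), the sketch below optimizes a small perturbation of a generative model's latent code until the decoded image is misclassified. The `generator`, `classifier`, and all hyperparameter names are assumptions made for this example.

```python
import torch

def latent_adversarial_perturbation(generator, classifier, z, true_label,
                                     steps=200, lr=0.05, reg=1.0):
    """Illustrative sketch: find a small latent perturbation `delta` such that
    classifier(generator(z + delta)) no longer predicts `true_label`.

    `generator` maps latent codes to images and `classifier` maps images to
    logits; both are assumed to be differentiable torch.nn.Module instances.
    """
    delta = torch.zeros_like(z, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        x = generator(z + delta)           # decode the perturbed latent code
        logits = classifier(x)
        if logits.argmax(dim=1).item() != true_label:
            break                          # decoded image is already misclassified
        target = torch.tensor([true_label], device=z.device)
        # Maximize the classification loss on the true label (untargeted attack)
        # while keeping the latent perturbation small via an L2 penalty.
        loss = -torch.nn.functional.cross_entropy(logits, target) + reg * delta.norm()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return delta.detach()
```

The size of the resulting latent perturbation, relative to typical latent noise, is the kind of quantity on which latent space robustness metrics can be built.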


