An Information-Theoretic Explanation for the Adversarial Fragility of AI Classifiers

01/27/2019
by Hui Xie, et al.

We present a simple hypothesis about a compression property of artificial intelligence (AI) classifiers and give theoretical arguments showing that this hypothesis accounts for the observed fragility of AI classifiers to small adversarial perturbations. We also propose a new method for detecting when small input perturbations cause classifier errors, and prove theoretical guarantees on the performance of this detection method. We present experimental results with a voice recognition system to demonstrate the method. The ideas in this paper are motivated by a simple analogy between AI classifiers and the standard Shannon model of a communication system.
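The abstract does not spell out the detection procedure itself, so the sketch below is only a generic illustration of one consistency-check style of detector that fits the communication-system analogy: the classifier's decision on the received (possibly perturbed) input is compared against its decisions on a smoothed copy and on a few randomly re-perturbed copies, and strong disagreement flags a likely adversarial input. The function names (`smooth`, `detect_perturbation`), the moving-average filter, and the voting threshold are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def smooth(x, kernel_width=5):
    # Hypothetical "denoising" step: a simple moving-average filter that
    # suppresses small, high-frequency perturbations in a 1-D signal.
    kernel = np.ones(kernel_width) / kernel_width
    return np.convolve(x, kernel, mode="same")

def detect_perturbation(classifier, x, n_trials=10, noise_std=0.01, seed=None):
    # Flag a likely adversarial input if the classifier's decision is
    # unstable under smoothing and small random re-perturbations.
    # `classifier` maps a 1-D numpy signal to a discrete label; every other
    # choice here is an assumption made for illustration only.
    rng = np.random.default_rng(seed)
    base_label = classifier(x)
    votes = [classifier(smooth(x))]
    for _ in range(n_trials):
        votes.append(classifier(x + noise_std * rng.standard_normal(x.shape)))
    disagreement = np.mean([v != base_label for v in votes])
    return disagreement > 0.5  # majority disagreement => flag as suspicious
```

A clean input should keep its label under these mild modifications, while an input pushed just across a decision boundary by a small adversarial perturbation tends to flip back, which is what the disagreement vote detects in this toy setting.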


Related research

Adversarial Robustness May Be at Odds With Simplicity (01/02/2019)
Current techniques in machine learning are so far unable to learn cl...

Analysis of classifiers' robustness to adversarial perturbations (02/09/2015)
The goal of this paper is to analyze an intriguing phenomenon recently d...

Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning (07/28/2020)
We consider the theoretical problem of designing an optimal adversarial ...

Adversarially Robust Generalization Requires More Data (04/30/2018)
Machine learning models are often susceptible to adversarial perturbatio...

Fast Construction of Correcting Ensembles for Legacy Artificial Intelligence Systems: Algorithms and a Case Study (10/12/2018)
This paper presents a technology for simple and computationally efficien...

Image classifiers can not be made robust to small perturbations (12/07/2021)
The sensitivity of image classifiers to small perturbations in the input...
