Detecting Potential Local Adversarial Examples for Human-Interpretable Defense

09/07/2018
by Xavier Renard, et al.

Machine learning models are increasingly used in industry to make decisions such as credit insurance approval. Some applicants may be tempted to manipulate specific variables, such as their age or salary, to improve their chances of approval. In this ongoing work, we discuss the problem of detecting a potential local adversarial example on classical tabular data and make a first proposition: provide a human expert with the features that are locally critical to the classifier's decision, so that the expert can verify the submitted information and prevent fraud.
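
To make the idea concrete, the sketch below illustrates one simple way to surface "locally critical" features on tabular data; it is a hypothetical illustration of the general idea, not the paper's actual method. It perturbs each feature of a submitted instance around its value, ranks features by how strongly the classifier's approval probability responds, and returns that ranking for a human expert to review. The dataset, feature names, and perturbation scales are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's method): rank features by
# how sensitive a tabular classifier's approval probability is to small
# local perturbations around a submitted instance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def locally_critical_features(model, x, feature_names, scales, n_samples=100):
    """Rank features by the local sensitivity of the approval probability."""
    x = np.asarray(x, dtype=float)
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    rng = np.random.default_rng(0)
    sensitivity = {}
    for j, name in enumerate(feature_names):
        # Perturb only feature j with Gaussian noise at its own scale.
        perturbed = np.tile(x, (n_samples, 1))
        perturbed[:, j] += rng.normal(0.0, scales[j], size=n_samples)
        probs = model.predict_proba(perturbed)[:, 1]
        sensitivity[name] = float(np.mean(np.abs(probs - base)))
    return sorted(sensitivity.items(), key=lambda kv: -kv[1])

# Illustrative usage on synthetic "credit" data (age, salary, seniority).
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3)) * [10, 15000, 5] + [40, 30000, 8]
y = (X[:, 1] + 2000 * rng.normal(size=1000) > 30000).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)

applicant = [35, 31000, 4]
ranking = locally_critical_features(
    model, applicant,
    feature_names=["age", "salary", "seniority"],
    scales=[2.0, 1000.0, 1.0],
)
# Top-ranked features are the ones the expert should verify, since small
# manipulations there could flip the decision.
print(ranking)
```

In this sketch the expert receives the sensitivity ranking rather than a raw accept/reject decision, which is the human-interpretable control the abstract describes: if the decision hinges on an easily manipulated variable such as salary, that value can be double-checked before approval.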

Related research

11/13/2019
Adversarial Examples in Modern Machine Learning: A Review
Recent research has found that many families of machine learning models ...

07/30/2019
Not All Adversarial Examples Require a Complex Defense: Identifying Over-optimized Adversarial Examples with IQR-based Logit Thresholding
Detecting adversarial examples currently stands as one of the biggest ch...

07/10/2020
Improved Detection of Adversarial Images Using Deep Neural Networks
Machine learning techniques are immensely deployed in both industry and ...

07/05/2021
When and How to Fool Explainable Models (and Humans) with Adversarial Examples
Reliable deployment of machine learning models such as neural networks c...

04/23/2020
Adversarial Machine Learning: An Interpretation Perspective
Recent years have witnessed the significant advances of machine learning...

04/23/2018
VectorDefense: Vectorization as a Defense to Adversarial Examples
Training deep neural networks on images represented as grids of pixels h...

07/26/2021
Benign Adversarial Attack: Tricking Algorithm for Goodness
In spite of the successful application in many fields, machine learning ...
