ATRO: Adversarial Training with a Rejection Option

10/24/2020
by Masahiro Kato, et al.

This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples. While recent machine learning algorithms achieve high prediction performance, they are empirically vulnerable to adversarial examples: slightly perturbed data samples that are wrongly classified. In real-world applications, attacks based on such adversarial examples could cause serious problems. Various methods have therefore been proposed to obtain classifiers that are robust against adversarial examples. Adversarial training is one of them; it trains a classifier to minimize the worst-case loss under adversarial attacks. In this paper, to obtain a more reliable classifier under adversarial attacks, we propose Adversarial Training with a Rejection Option (ATRO). By applying the adversarial training objective to a classifier and a rejection function simultaneously, a classifier trained by ATRO can abstain from classification when it has insufficient confidence in a test data point. We examine the feasibility of the framework using the surrogate maximum hinge loss and establish a generalization bound for linear models. Furthermore, we empirically confirm the effectiveness of ATRO on various models and real-world datasets.
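To make the objective concrete, the following is a minimal sketch of ATRO-style training for the linear case. It assumes the max-hinge surrogate of Cortes et al. (2016) for classification with rejection and substitutes a one-step FGSM-style perturbation for the full worst-case inner maximization; the class and function names, hyperparameter values (alpha, beta, cost, eps), and toy data are illustrative, not taken from the paper.

```python
# Minimal sketch of adversarial training with a rejection option (ATRO-style),
# assuming linear models and the max-hinge surrogate; hyperparameters are
# illustrative, not the paper's values.
import torch

def max_hinge_loss(h, r, y, alpha=1.0, beta=1.0, cost=0.3):
    """Max-hinge surrogate for classification with rejection (Cortes et al., 2016).
    Labels y are in {-1, +1}; the model abstains at test time when r(x) <= 0.
    The rejection cost should satisfy 0 < cost < 0.5 for abstention to be useful."""
    term_cls = 1.0 + 0.5 * alpha * (r - y * h)   # penalizes misclassification
    term_rej = cost * (1.0 - beta * r)           # penalizes rejection
    return torch.clamp(torch.maximum(term_cls, term_rej), min=0.0).mean()

class LinearATRO(torch.nn.Module):
    """Linear classifier score h(x) and linear rejection score r(x)."""
    def __init__(self, dim):
        super().__init__()
        self.h = torch.nn.Linear(dim, 1)
        self.r = torch.nn.Linear(dim, 1)

    def forward(self, x):
        return self.h(x).squeeze(-1), self.r(x).squeeze(-1)

def fgsm_perturb(model, x, y, eps=0.1):
    """One-step L-infinity attack on the joint surrogate loss, a cheap stand-in
    for the worst-case inner maximization of adversarial training."""
    x_adv = x.clone().requires_grad_(True)
    h, r = model(x_adv)
    grad, = torch.autograd.grad(max_hinge_loss(h, r, y), x_adv)
    return (x + eps * grad.sign()).detach()

def train_step(model, opt, x, y):
    x_adv = fgsm_perturb(model, x, y)   # inner maximization (approximate)
    h, r = model(x_adv)
    loss = max_hinge_loss(h, r, y)      # outer minimization over h and r jointly
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage: a noisy linear problem in 5 dimensions.
torch.manual_seed(0)
x = torch.randn(256, 5)
y = torch.sign(x[:, 0] + 0.3 * torch.randn(256))
model = LinearATRO(dim=5)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
for _ in range(200):
    train_step(model, opt, x, y)
with torch.no_grad():
    h, r = model(x)
print(f"rejection rate on training data: {(r <= 0).float().mean().item():.2f}")
```

The key point this sketch illustrates is that the rejection score r is trained jointly with the classifier score h on the perturbed inputs, so the rejector learns where adversarial perturbations make confident classification impossible, rather than being calibrated only on clean data.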


Related research

05/27/2017 · A Multi-strength Adversarial Training Method to Mitigate Adversarial Attacks
Some recent works revealed that deep neural networks (DNNs) are vulnerab...

11/25/2019 · Playing it Safe: Adversarial Robustness with an Abstain Option
We explore adversarial robustness in the setting in which it is acceptab...

11/20/2019 · Deep Minimax Probability Machine
Deep neural networks enjoy a powerful representation and have proven eff...

01/25/2023 · A Data-Centric Approach for Improving Adversarial Training Through the Lens of Out-of-Distribution Detection
Current machine learning models achieve super-human performance in many ...

08/19/2022 · A Novel Plug-and-Play Approach for Adversarially Robust Generalization
In this work, we propose a robust framework that employs adversarially r...

02/16/2019 · Mitigation of Adversarial Examples in RF Deep Classifiers Utilizing AutoEncoder Pre-training
Adversarial examples in machine learning for images are widely publicize...

09/12/2023 · Using Reed-Muller Codes for Classification with Rejection and Recovery
When deploying classifiers in the real world, users expect them to respo...
