Directional Adversarial Training for Cost Sensitive Deep Learning Classification Applications

by Matteo Terzi, et al.

In many real-world applications of Machine Learning it is of paramount importance not only to provide accurate predictions, but also to ensure certain levels of robustness. Adversarial Training is a training procedure that aims to produce models that are robust to worst-case perturbations around predefined points. Unfortunately, one of the main issues in adversarial training is that robustness w.r.t. gradient-based attacks is always achieved at the cost of prediction accuracy. In this paper, a new algorithm for adversarial training, called Wasserstein Projected Gradient Descent (WPGD), is proposed. WPGD provides a simple way to obtain cost-sensitive robustness, resulting in finer control of the robustness-accuracy trade-off. Moreover, WPGD solves an optimal transport problem on the output space of the network and can efficiently discover directions where robustness is required, allowing control of the directional trade-off between accuracy and robustness. The proposed WPGD is validated in this work on image recognition tasks with different benchmark datasets and architectures. Moreover, real-world datasets are often unbalanced: this paper shows that when dealing with such datasets, the performance of adversarial training is mainly affected in terms of standard accuracy.
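The abstract builds on the standard Projected Gradient Descent (PGD) attack, which WPGD extends with a Wasserstein cost on the output space. The Wasserstein variant itself is not reproduced here; as a point of reference, the following is a minimal NumPy sketch of plain PGD on a toy logistic model (the function name and parameters are illustrative, not from the paper): the attack repeatedly takes signed gradient-ascent steps on the loss and projects back onto an L-infinity ball around the clean input.

```python
import numpy as np

def pgd_attack(x, y, w, b, eps=0.3, alpha=0.05, steps=10):
    """PGD attack on a logistic model p = sigmoid(w.x + b).

    Maximizes the cross-entropy loss within an L-infinity ball
    of radius `eps` around the clean input `x`.
    """
    x_adv = x.copy()
    for _ in range(steps):
        z = w @ x_adv + b
        p = 1.0 / (1.0 + np.exp(-z))
        grad = (p - y) * w                         # d(loss)/d(x_adv)
        x_adv = x_adv + alpha * np.sign(grad)      # signed ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project onto the ball
    return x_adv
```

Adversarial training then simply trains the model on `pgd_attack`-perturbed inputs instead of (or alongside) the clean ones; WPGD replaces the uniform loss with a transport cost so that robustness can be weighted differently per class direction.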

