A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space

12/02/2021
by Thibault Simonetto, et al.

The generation of feasible adversarial examples is necessary for properly assessing models that work on constrained feature space. However, it remains a challenging task to enforce constraints in attacks that were designed for computer vision. We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints. Our framework supports the use cases reported in the literature and can handle both linear and non-linear constraints. We instantiate our framework into two algorithms: a gradient-based attack that introduces constraints in the loss function to maximize, and a multi-objective search algorithm that aims for misclassification, perturbation minimization, and constraint satisfaction. We show that our approach is effective on two datasets from different domains, with a success rate of up to 100%, where state-of-the-art attacks fail to generate a single feasible example. In addition to adversarial retraining, we propose to introduce engineered non-convex constraints to improve model adversarial robustness. We demonstrate that this new defense is as effective as adversarial retraining. Our framework forms the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that future research can exploit.
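
The idea of introducing constraints into the loss that a gradient-based attack maximizes can be illustrated with a small sketch. This is not the authors' algorithm: the logistic model, the single linear constraint `x[0] + x[1] <= 1`, the penalty weight `lam`, and all function names below are hypothetical choices made only for illustration. The objective is the classification loss minus a penalty on constraint violation, maximized with signed gradient steps inside an L-infinity ball.

```python
import numpy as np

def constraint_violation(x):
    # Hypothetical domain constraint x[0] + x[1] <= 1, written as g(x) <= 0.
    # Returns 0 when the constraint holds, otherwise the amount of violation.
    return max(0.0, x[0] + x[1] - 1.0)

def attack_objective(x, w, b, y, lam):
    # Cross-entropy of a logistic model (to be maximized by the attacker)
    # minus a weighted penalty for violating the constraint.
    z = w @ x + b
    cross_entropy = np.logaddexp(0.0, z) - y * z
    return cross_entropy - lam * constraint_violation(x)

def objective_gradient(x, w, b, y, lam):
    # Analytic (sub)gradient of attack_objective with respect to x.
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad_loss = (p - y) * w
    grad_penalty = np.zeros_like(x)
    if x[0] + x[1] - 1.0 > 0.0:
        grad_penalty[:2] = 1.0
    return grad_loss - lam * grad_penalty

def constrained_gradient_attack(x0, w, b, y, eps, lam=10.0, step=0.05, iters=100):
    # Signed gradient ascent, projected onto the L-infinity ball of radius eps
    # around x0. The best iterate seen so far (by objective value) is returned.
    x = x0.copy()
    best_x, best_val = x0.copy(), attack_objective(x0, w, b, y, lam)
    for _ in range(iters):
        x = x + step * np.sign(objective_gradient(x, w, b, y, lam))
        x = np.clip(x, x0 - eps, x0 + eps)
        val = attack_objective(x, w, b, y, lam)
        if val > best_val:
            best_x, best_val = x.copy(), val
    return best_x

# Illustrative usage with made-up weights and a made-up input.
w = np.array([-2.0, -1.0, 0.5])
x0 = np.array([0.4, 0.4, 0.2])
x_adv = constrained_gradient_attack(x0, w, b=0.0, y=1, eps=0.3)
print("perturbed input:", x_adv)
print("constraint violation:", constraint_violation(x_adv))
```

In this toy setting the unpenalized loss pushes both `x[0]` and `x[1]` upward, so the penalty term is what keeps the search near the feasible region. A penalty alone does not guarantee feasibility, which is why the final violation is printed rather than assumed to be zero.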


research
06/26/2020

A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models

Deep Neural Networks are well known to be vulnerable to adversarial atta...
research
05/30/2022

Domain Constraints in Feature Space: Strengthening Robustness of Android Malware Detection against Realizable Adversarial Examples

Strengthening the robustness of machine learning-based malware detectors...
research
05/26/2019

Generalizable Adversarial Attacks Using Generative Models

Adversarial attacks on deep neural networks traditionally rely on a cons...
research
06/02/2020

Perturbation Analysis of Gradient-based Adversarial Attacks

After the discovery of adversarial examples and their adverse effects on...
research
10/13/2021

Identification of Attack-Specific Signatures in Adversarial Examples

The adversarial attack literature contains a myriad of algorithms for cr...
research
11/02/2020

Adversarial Examples in Constrained Domains

Machine learning algorithms have been shown to be vulnerable to adversar...
research
11/05/2019

Intriguing Properties of Adversarial ML Attacks in the Problem Space

Recent research efforts on adversarial ML have investigated problem-spac...
