Defending against substitute model black box adversarial attacks with the 01 loss

09/01/2020
by Yunzhe Xue, et al.

Substitute model black box attacks can create adversarial examples for a target model merely by accessing its output labels. This poses a major challenge to machine learning models in practice, particularly in security sensitive applications. The 01 loss model is known to be more robust to outliers and noise than the convex models typically used in practice. Motivated by these properties, we present 01 loss linear and 01 loss dual layer neural network models as a defense against transfer based substitute model black box attacks. We compare the accuracy of adversarial examples from substitute model black box attacks targeting our 01 loss models and their convex counterparts for binary classification on popular image benchmarks: MNIST, CIFAR10, STL10, and ImageNet. On MNIST our 01 loss dual layer neural network has an adversarial accuracy of 66.2% whereas the sigmoid activated logistic loss counterpart has 63.5%; on the remaining benchmarks the convex counterparts have substantially lower adversarial accuracies than our 01 loss models. We also show practical applications of our models to deter traffic sign and facial recognition adversarial attacks: on GTSRB street sign detection our 01 loss network attains an adversarial accuracy of 34.6% whereas the convex logistic counterpart attains 24%, and our network is likewise more robust on CelebA facial detection. Finally, we show that our 01 loss network can attain robustness on par with simple convolutional neural networks, and much higher than its convex counterpart, even when attacked with a convolutional network substitute model. Our work shows that 01 loss models offer a powerful defense against substitute model black box attacks.
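To make the contrast concrete, here is a minimal sketch of the two losses for a linear binary classifier. The names (zero_one_loss, logistic_loss, w, b, X, y) are our own illustrative choices, not the paper's code.

import numpy as np

def zero_one_loss(w, b, X, y):
    # Fraction of misclassified points, with labels y in {-1, +1}.
    # Non-convex and flat almost everywhere, so it yields no useful
    # gradients and is typically optimized by search heuristics.
    margins = y * (X @ w + b)
    return np.mean(margins <= 0)

def logistic_loss(w, b, X, y):
    # Convex surrogate log(1 + exp(-margin)). Its smooth gradients make
    # training easy, but also make the decision boundary easier for an
    # attacker's substitute model to approximate.
    margins = y * (X @ w + b)
    return np.mean(np.logaddexp(0.0, -margins))

# Toy usage on synthetic data separable along the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
w, b = np.array([1.0, 0.0, 0.0, 0.0, 0.0]), 0.0
print(zero_one_loss(w, b, X, y), logistic_loss(w, b, X, y))

Because the 01 loss is flat almost everywhere, its minimizers need not resemble those of the convex surrogate, which is the intuition behind its resistance to attacks transferred from convex substitutes.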

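The attack this defends against can also be sketched in a few lines. The sketch below assumes only label access through a hypothetical black box target_predict; the substitute architecture, training loop, FGSM step, and epsilon are illustrative assumptions, not the paper's exact experimental setup.

import torch
import torch.nn as nn

def substitute_attack(target_predict, X, epsilon=0.1, epochs=200):
    # X: float tensor of inputs scaled to [0, 1], shape (n, d).
    # target_predict: black box returning 0/1 integer labels for a batch;
    # querying it is the attacker's only access to the target model.
    y = target_predict(X)
    # Train a small substitute network to mimic the target's labels.
    sub = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(sub.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(sub(X), y).backward()
        opt.step()
    # Craft FGSM adversarial examples on the substitute, not the target.
    X_adv = X.clone().requires_grad_(True)
    loss_fn(sub(X_adv), y).backward()
    X_adv = (X_adv + epsilon * X_adv.grad.sign()).clamp(0.0, 1.0).detach()
    # Adversarial accuracy: how often the target's label survives the attack.
    return (target_predict(X_adv) == y).float().mean().item()

Against a convex target such perturbations transfer well; the paper's claim is that a 01 loss target, whose decision boundary the convex substitute approximates poorly, retains much higher adversarial accuracy under the same attack.
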
Related research

02/09/2020 · Robust binary classification with the 01 loss
The 01 loss is robust to outliers and tolerant to noisy data compared to...

08/20/2020 · Towards adversarial robustness with 01 loss neural networks
Motivated by the general robustness properties of the 01 loss we propose...

06/14/2020 · On the transferability of adversarial examples between convex and 01 loss models
We show that white box adversarial examples do not transfer effectively ...

03/24/2023 · Effective black box adversarial attack with handcrafted kernels
We propose a new, simple framework for crafting adversarial examples for...

03/16/2018 · Adversarial Logit Pairing
In this paper, we develop improved techniques for defending against adve...

05/28/2019 · ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation
Deep neural networks are vulnerable to adversarial attacks. The literatu...

06/15/2018 · Random depthwise signed convolutional neural networks
Random weights in convolutional neural networks have shown promising res...
