Potential adversarial samples for white-box attacks
Deep convolutional neural networks can be highly vulnerable to small perturbations of their inputs, potentially a major issue or limitation on system robustness when using deep networks as classifiers. In this paper we propose a low-cost method to explore marginal sample data near trained classifier decision boundaries, thus identifying potential adversarial samples. By finding such adversarial samples it is possible to reduce the search space of adversarial attack algorithms while keeping a reasonable successful perturbation rate. In our developed strategy, the potential adversarial samples represent only 61 adversarial samples produced by iFGSM and 92 successfully perturbed by DeepFool on CIFAR10.
READ FULL TEXT