Towards Adversarial Training with Moderate Performance Improvement for Neural Network Classification
It has been demonstrated that deep neural networks are prone to noisy examples, particularly adversarial samples, during inference. The gap between robust deep learning systems in real-world applications and vulnerable neural networks remains large. Current adversarial training strategies improve robustness against adversarial samples; however, these methods reduce accuracy on clean input examples, which hinders their practicality. In this paper, we investigate an approach that protects neural network classifiers from adversarial samples while also improving their accuracy on clean inputs. We demonstrate the versatility and effectiveness of the proposed approach on a variety of networks and datasets.
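The abstract does not detail the proposed method, but the standard adversarial-training baseline it contrasts with can be sketched as below. This is a minimal illustrative sketch, assuming a PyTorch classifier; the FGSM attack, the `epsilon` budget, the clean/adversarial mixing weight `alpha`, and all function names are assumptions, not the paper's method.

```python
# Illustrative FGSM-based adversarial training step (not the paper's method).
# `model`, `optimizer`, `x`, `y`, `epsilon`, and `alpha` are hypothetical.
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=8 / 255):
    """Craft an adversarial example by stepping along the sign of the
    input gradient of the loss (FGSM)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()  # keep pixels in valid range

def adversarial_training_step(model, optimizer, x, y,
                              epsilon=8 / 255, alpha=0.5):
    """One training step on a weighted mix of clean and adversarial
    examples; alpha trades clean accuracy against robustness."""
    x_adv = fgsm_example(model, x, y, epsilon)
    optimizer.zero_grad()
    clean_loss = F.cross_entropy(model(x), y)
    adv_loss = F.cross_entropy(model(x_adv), y)
    loss = alpha * clean_loss + (1 - alpha) * adv_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

Training only on the adversarial term (alpha = 0) is what typically causes the clean-accuracy drop the abstract describes; the mixing weight is one common, simple mitigation.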