On Success and Simplicity: A Second Look at Transferable Targeted Attacks
There is broad consensus among researchers studying adversarial examples that it is extremely difficult to achieve transferability of targeted attacks. Existing research strives for transferability of targeted attacks by resorting to sophisticated losses and even massive training. In this paper, we take a second look at the transferability of targeted attacks and show that their difficulty has been overestimated due to a blind spot in conventional evaluation procedures. Specifically, current work has unreasonably restricted attack optimization to a few iterations. Here, we show that targeted attacks converge slowly to optimal transferability and improve considerably when given more iterations. We also demonstrate that an attack that simply maximizes the target logit performs surprisingly well, surpassing more complex losses and even achieving performance comparable to the state of the art, which requires massive training with a sophisticated loss. We provide further validation of our logit attack in a realistic ensemble setting and in a real-world attack against the Google Cloud Vision API. The logit attack produces perturbations that reflect the target semantics, which we demonstrate allows us to create targeted universal adversarial perturbations without additional training images.
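As a rough illustration of what "simply maximizes the target logit" can look like, the following is a minimal sketch, not the paper's implementation. It assumes an iterative sign-gradient update under an L-infinity budget, estimates gradients by finite differences so that it needs no deep learning library, and uses a hypothetical two-class linear `toy_model`. The names `logit_attack`, `target_logit_grad`, `eps`, `step`, and `iters` are illustrative assumptions.

```python
def target_logit_grad(model, x, target, h=1e-4):
    """Central-difference estimate of d(logit[target]) / dx (illustrative only)."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((model(x_plus)[target] - model(x_minus)[target]) / (2 * h))
    return grad


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def logit_attack(model, x, target, eps=0.5, step=0.05, iters=100):
    """Iteratively raise the target logit while staying within eps of x (L-infinity)."""
    x_adv = list(x)
    for _ in range(iters):
        grad = target_logit_grad(model, x_adv, target)
        # Ascend the target logit, i.e. descend the loss L = -logit[target].
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back onto the eps-ball around the original input.
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv


# Hypothetical two-class linear model standing in for a real classifier.
W = [[1.0, -1.0, 0.5], [-0.5, 1.0, 1.0]]


def toy_model(x):
    """Return one logit per class for input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


x = [1.0, 0.0, 0.0]
x_adv = logit_attack(toy_model, x, target=1)
print(toy_model(x), toy_model(x_adv))
```

On a linear toy model the gradient is constant, so the perturbation saturates at the budget after a few steps. The slow convergence described above concerns deep networks, where the iteration count would be an experimental knob rather than a fixed small number.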