Adversarial robustness against multiple l_p-threat models at the price of one and how to quickly fine-tune robust models to another threat model

05/26/2021

Adversarial training (AT) for achieving adversarial robustness with respect to a single l_p-threat model has been studied extensively. However, for safety-critical systems, adversarial robustness should be achieved with respect to all l_p-threat models simultaneously. In this paper we develop a simple and efficient training scheme to achieve adversarial robustness against the union of l_p-threat models. Our novel l_1+l_∞-AT scheme is based on geometric considerations of the different l_p-balls and costs as much as normal adversarial training against a single l_p-threat model. Moreover, we show that with our l_1+l_∞-AT scheme one can fine-tune any l_p-robust model (for p ∈ {1,2,∞}) for just 3 epochs and achieve multiple-norm adversarial robustness. In this way we improve the previous state of the art reported for multiple-norm robustness by more than 6% on CIFAR-10 and report, to the best of our knowledge, the first ImageNet models with multiple-norm robustness. Finally, we study the general transfer of adversarial robustness between different threat models and in this way improve the previous state-of-the-art l_1-robustness on CIFAR-10 by almost 10%.
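To make "robustness against the union of l_p-threat models" concrete: a test point counts as robust in the union only if no attack in any of the individual threat models succeeds on it. The sketch below illustrates that bookkeeping only. It is not the paper's training scheme or evaluation code; the function name `union_robust_accuracy` and the per-point flags are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def union_robust_accuracy(robust: Dict[str, List[bool]]) -> float:
    """Fraction of test points that are robust under every threat model.

    `robust[name][i]` is True if point i is correctly classified and no
    adversarial example was found for it in threat model `name`.
    """
    per_model = list(robust.values())
    if not per_model or not per_model[0]:
        raise ValueError("need at least one threat model and one test point")
    n = len(per_model[0])
    if any(len(flags) != n for flags in per_model):
        raise ValueError("all threat models must cover the same test points")
    # A point is robust in the union only if it is robust in each model.
    union_flags = [all(column) for column in zip(*per_model)]
    return sum(union_flags) / n


# Hypothetical flags for five test points (illustrative, not results from the paper).
flags = {
    "l1":   [True, True, False, True, False],
    "l2":   [True, True, True, True, False],
    "linf": [True, False, True, True, False],
}
individual = {name: sum(f) / len(f) for name, f in flags.items()}
union = union_robust_accuracy(flags)
```

By construction, the union value can never exceed the smallest individual robust accuracy, which is why multiple-norm robustness is a stricter target than robustness to any single l_p-threat model.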
