What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?

10/11/2022
by Nikolaos Tsilivis, et al.

The adversarial vulnerability of neural nets, and the subsequent techniques developed to create robust models, have attracted significant attention; yet we still lack a full understanding of this phenomenon. Here, we study adversarial examples of trained neural networks through analytical tools afforded by recent theoretical advances connecting neural networks and kernel methods, namely the Neural Tangent Kernel (NTK), following a growing body of work that leverages the NTK approximation to successfully analyze important deep learning phenomena and design algorithms for new applications. We show how NTKs allow us to generate adversarial examples in a “training-free” fashion, and demonstrate that these transfer to fool their finite-width neural net counterparts in the “lazy” regime. We leverage this connection to provide an alternative view of robust and non-robust features, which have been suggested to underlie the adversarial brittleness of neural nets. Specifically, we define and study features induced by the eigendecomposition of the kernel to better understand the role of robust and non-robust features, the reliance on both for standard classification, and the robustness-accuracy trade-off. We find that such features are surprisingly consistent across architectures, and that robust features tend to correspond to the largest eigenvalues of the model, and thus are learned early during training. Our framework allows us to identify and visualize non-robust yet useful features. Finally, we shed light on the robustness mechanism underlying adversarial training of neural nets used in practice: quantifying the evolution of the associated empirical NTK, we demonstrate that its dynamics fall much earlier into the “lazy” regime and manifest a much stronger form of the well-known bias to prioritize learning features within the top eigenspaces of the kernel, compared to standard training.
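To make the kernel-eigendecomposition view concrete, here is a minimal numpy sketch (not the paper's code). An RBF Gram matrix stands in for an empirical NTK, which is an assumption purely for illustration; any PSD kernel exhibits the same spectral mechanics. Each eigenvector of the Gram matrix defines a "feature" on the training set, and under kernel gradient descent the component of the residual along eigenvector i decays at a rate proportional to its eigenvalue, so well-aligned top-eigenvalue features are the ones learned earliest:

```python
import numpy as np

# Toy data: 50 samples in 10 dimensions, labels given by the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = np.sign(X[:, 0])

# Gram matrix K_ij = k(x_i, x_j) with an RBF kernel (stand-in for the NTK).
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / (2 * X.shape[1]))

# Eigendecomposition: column i of eigvecs is the i-th "kernel feature"
# evaluated on the training points. Sort by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(K)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# "Usefulness" of feature i: squared alignment of the labels with
# eigenvector i (these sum to 1 since the eigenvectors are orthonormal).
# Kernel gradient descent fits component i at rate ~ eigvals[i], so mass
# concentrated in the top entries means those features are learned first.
alignment = (eigvecs.T @ y) ** 2 / (y @ y)
print(alignment[:5], eigvals[:5])
```

The paper's observation that robust features sit in the top eigenspaces corresponds, in this picture, to the label-relevant (and robust) alignment mass concentrating on the largest eigenvalues.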


Related research:

- 10/21/2022: Evolution of Neural Tangent Kernels under Benign and Adversarial Training. Two key challenges facing modern deep learning are mitigating deep netwo...
- 04/26/2022: On Fragile Features and Batch Normalization in Adversarial Training. Modern deep learning architecture utilize batch normalization (BN) to st...
- 12/21/2019: Jacobian Adversarially Regularized Networks for Robustness. Adversarial examples are crafted with imperceptible perturbations with t...
- 05/06/2019: Adversarial Examples Are Not Bugs, They Are Features. Adversarial examples have attracted significant attention in machine lea...
- 09/16/2022: Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study. Neural tangent kernel (NTK) is a powerful tool to analyze training dynam...
- 07/24/2022: Can we achieve robustness from data alone? Adversarial training and its variants have come to be the prevailing met...
- 05/26/2019: State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations. Machine learning promises methods that generalize well from finite label...
