Flatten the Curve: Efficiently Training Low-Curvature Neural Networks

06/14/2022
by Suraj Srinivas, et al.

The highly non-linear nature of deep neural networks makes them susceptible to adversarial examples and gives them unstable gradients, which hinders interpretability. However, existing methods that address these issues, such as adversarial training, are expensive and often sacrifice predictive accuracy. In this work, we consider curvature, a mathematical quantity that encodes the degree of non-linearity. Using this, we demonstrate low-curvature neural networks (LCNNs) that obtain drastically lower curvature than standard models while exhibiting similar predictive performance, which leads to improved robustness and stable gradients with only a marginal increase in training time. To achieve this, we minimize a data-independent upper bound on the curvature of a neural network, which decomposes overall curvature in terms of the curvatures and slopes of its constituent layers. To efficiently minimize this bound, we introduce two novel architectural components: first, a non-linearity called centered-softplus, a stable variant of the softplus non-linearity; and second, a Lipschitz-constrained batch normalization layer. Our experiments show that LCNNs have lower curvature, more stable gradients, and increased off-the-shelf adversarial robustness compared to their standard high-curvature counterparts, all without affecting predictive performance. Our approach is easy to use and can be readily incorporated into existing neural network models.
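
The abstract names centered-softplus but does not define it. As a minimal sketch only, assume it is the softplus shifted so that it passes through the origin, i.e. softplus(x; beta) - log(2)/beta. That formula, the function names, and the finite-difference check below are illustrative assumptions, not the authors' implementation. The sketch shows why such a non-linearity has a controllable curvature: the second derivative of softplus peaks at x = 0 with value beta/4, so a smaller beta gives a flatter activation.

```python
import math


def softplus(x: float, beta: float = 1.0) -> float:
    """Softplus log(1 + exp(beta * x)) / beta, computed in an overflow-safe form."""
    z = beta * x
    # Identity: log(1 + e^z) = max(z, 0) + log1p(e^{-|z|})
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta


def centered_softplus(x: float, beta: float = 1.0) -> float:
    """ASSUMED definition: softplus shifted so that the output at x = 0 is 0."""
    return softplus(x, beta) - math.log(2.0) / beta


def second_derivative(f, x: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of f''(x); a hypothetical helper for checking."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0, 4.0):
        act = lambda x, b=beta: centered_softplus(x, b)
        print(f"beta={beta}: f(0)={act(0.0):.2e}, f''(0)~{second_derivative(act, 0.0):.4f}")
```

Under this assumed definition, the shift changes neither the slope nor the curvature of softplus; it only removes the constant offset log(2)/beta at the origin, which grows as beta shrinks.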
