Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks

by   Alexander Fuchs, et al.

Deep neural networks rely heavily on normalization methods to improve their performance and learning behavior. Although normalization methods spurred the development of increasingly deep and efficient architectures, they also increase the vulnerability with respect to noise and input corruptions. In most applications, however, noise is ubiquitous and diverse; this can often lead to complete failure of machine learning systems as they fail to cope with mismatches between the input distribution during training- and test-time. The most common normalization method, batch normalization, reduces the distribution shift during training but is agnostic to changes in the input distribution during test time. This makes batch normalization prone to performance degradation whenever noise is present during test-time. Sample-based normalization methods can correct linear transformations of the activation distribution but cannot mitigate changes in the distribution shape; this makes the network vulnerable to distribution changes that cannot be reflected in the normalization parameters. We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer. This reduces the mismatch between the training and test-time distribution by minimizing the 1-D Wasserstein distance. In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions and thus improves the classification performance without the need for retraining or fine-tuning the model.


page 6

page 12


Test-time Batch Normalization

Deep neural networks often suffer the data distribution shift between tr...

TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation

This paper proposes a novel batch normalization strategy for test-time a...

GradNets: Dynamic Interpolation Between Neural Architectures

In machine learning, there is a fundamental trade-off between ease of op...

LocalNorm: Robust Image Classification through Dynamically Regularized Normalization

While modern convolutional neural networks achieve outstanding accuracy ...

Be Like Water: Robustness to Extraneous Variables Via Adaptive Feature Normalization

Extraneous variables are variables that are irrelevant for a certain tas...

Probabilistic Binary Neural Networks

Low bit-width weights and activations are an effective way of combating ...

Simple High Quality OoD Detection with L2 Normalization

We propose a simple modification to standard ResNet architectures during...

Please sign up or login with your details

Forgot password? Click here to reset