Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks

by Alexander Fuchs et al.

Deep neural networks rely heavily on normalization methods to improve their performance and learning behavior. Although normalization methods have spurred the development of increasingly deep and efficient architectures, they also increase vulnerability to noise and input corruptions. In most applications, however, noise is ubiquitous and diverse; this can lead to complete failure of machine learning systems, which cannot cope with mismatches between the input distributions at training and test time. The most common normalization method, batch normalization, reduces the distribution shift during training but is agnostic to changes in the input distribution at test time. This makes batch normalization prone to performance degradation whenever noise is present at test time. Sample-based normalization methods can correct linear transformations of the activation distribution but cannot mitigate changes in the distribution shape; this leaves the network vulnerable to distribution changes that cannot be reflected in the normalization parameters. We propose an unsupervised, non-parametric distribution correction method that adapts the activation distribution of each layer. This reduces the mismatch between the training and test-time distributions by minimizing the 1-D Wasserstein distance. In our experiments, we show empirically that the proposed method effectively reduces the impact of intense image corruptions and thus improves classification performance without the need for retraining or fine-tuning the model.
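
To make the idea of non-parametric correction concrete, the following is a minimal sketch, not the authors' procedure. It assumes that quantiles of a layer's activations were stored at training time, and it uses plain quantile matching (the standard monotone transport map in one dimension) as a stand-in for the correction step. The names `wasserstein_1d`, `quantile_match`, `train_act`, and `test_act`, as well as the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def wasserstein_1d(a, b, n_quantiles=101):
    """Approximate the 1-D Wasserstein-1 distance by comparing quantile functions."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    return float(np.mean(np.abs(np.quantile(a, q) - np.quantile(b, q))))


def quantile_match(test_act, train_quantiles):
    """Map each test activation to the training value at the same empirical rank.

    Assumption: train_quantiles holds training-time quantiles on an evenly
    spaced probability grid from 0 to 1.
    """
    flat = test_act.ravel()
    ranks = np.argsort(np.argsort(flat))      # rank of each activation
    probs = (ranks + 0.5) / flat.size         # empirical CDF value in (0, 1)
    grid = np.linspace(0.0, 1.0, train_quantiles.size)
    return np.interp(probs, grid, train_quantiles).reshape(test_act.shape)


# Synthetic stand-ins: Gaussian "training" activations and skewed "test"
# activations whose shape differs, not just their mean and variance.
rng = np.random.default_rng(0)
train_act = rng.normal(0.0, 1.0, size=10_000)
test_act = rng.exponential(1.0, size=4_096)

train_quantiles = np.quantile(train_act, np.linspace(0.0, 1.0, 257))
corrected = quantile_match(test_act, train_quantiles)

before = wasserstein_1d(test_act, train_act)
after = wasserstein_1d(corrected, train_act)
print(f"W1 before: {before:.3f}, after: {after:.3f}")
```

Because the mapping is monotone, the relative ordering of activations is preserved while the shape of their distribution is pulled toward the training-time reference. A mean-and-variance rescaling could not do this for the skewed example above, since the mismatch lies in the distribution shape rather than in a linear transformation.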




