Regularizing activations in neural networks via distribution matching with the Wasserstein metric

02/13/2020
by Taejong Joo et al.

Regularization and normalization have become indispensable components of training deep neural networks, yielding faster training and improved generalization. We propose the projected error function regularization loss (PER), which encourages activations to follow the standard normal distribution. PER randomly projects activations onto a one-dimensional space and computes the regularization loss in that projected space. In the projected space, PER resembles the Pseudo-Huber loss, thereby combining the advantages of the L^1 and L^2 regularization losses. Moreover, PER captures interactions between hidden units through projection vectors drawn from the unit sphere. In doing so, PER minimizes an upper bound on the Wasserstein distance of order one between the empirical distribution of activations and the standard normal distribution. To the best of the authors' knowledge, this is the first work to regularize activations via distribution matching in probability distribution space. We evaluate the proposed method on image classification and word-level language modeling tasks.
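To make the construction concrete, below is a minimal PyTorch sketch of a PER-style penalty as we read the abstract. It assumes the per-projection penalty is f(t) = E|t - Z| for Z ~ N(0, 1), which has the closed form t*erf(t/sqrt(2)) + sqrt(2/pi)*exp(-t^2/2) and behaves quadratically near zero and linearly in the tails, as the Pseudo-Huber comparison suggests. The function name per_loss, the number of Monte Carlo projections, and this exact closed form are our assumptions, not details confirmed by the abstract.

import math
import torch

def per_loss(h: torch.Tensor, n_projections: int = 8) -> torch.Tensor:
    # h: activations of shape (batch, d).
    d = h.shape[1]
    # Draw random directions uniformly from the unit sphere S^{d-1}
    # (normalized Gaussian vectors); each column is a projection vector.
    w = torch.randn(d, n_projections, device=h.device, dtype=h.dtype)
    w = w / w.norm(dim=0, keepdim=True)
    # Project activations onto each direction: shape (batch, n_projections).
    t = h @ w
    # f(t) = E|t - Z| for Z ~ N(0, 1): quadratic near zero, linear in the
    # tails (Pseudo-Huber-like). Averaging it over projected samples gives
    # an upper bound on the projected Wasserstein-1 distance to the
    # standard normal (assumed per-projection penalty).
    f = t * torch.erf(t / math.sqrt(2.0)) \
        + math.sqrt(2.0 / math.pi) * torch.exp(-0.5 * t ** 2)
    return f.mean()

In use, one would add a small multiple of per_loss(h) to the task loss for each layer whose activations h are regularized; the placement and weighting are likewise assumptions for illustration.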
