Regularity Normalization: Constraining Implicit Space with Minimum Description Length
Inspired by the adaptation phenomenon of biological neuronal firing rates, we propose regularity normalization: a reparametrization of the activations in a neural network that takes into account the statistical regularity in the implicit space. The implicit space is constrained by a normalizing factor, the minimum description length of the optimal universal code. We introduce an incremental procedure for computing this universal code as the normalized maximum likelihood, demonstrate its flexibility to include data priors such as top-down attention and other oracle information, and show its compatibility with batch normalization and layer normalization. The preliminary empirical results are inconclusive regarding its advantages in handling limited and imbalanced data from a non-stationary distribution, benchmarked on a computer vision task. This biologically plausible normalization, however, has the potential to handle other complicated real-world scenarios; further research is proposed to identify these scenarios and explore the behaviors of its different variants.
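The core idea sketched below is illustrative only and is not the paper's exact algorithm: each activation is re-weighted by the code length of its incremental normalized maximum likelihood (NML) under a running Gaussian maximum-likelihood model, so that statistically "regular" activations are attenuated relative to surprising ones. The function name, the Gaussian model choice, and the multiplicative weighting are all assumptions made for the sake of a minimal example.

```python
import numpy as np

def incremental_nml_scale(activations, eps=1e-8):
    """Toy sketch of regularity-style normalization (assumed form, not
    the paper's exact update): weight each activation by the description
    length -log NML under a running Gaussian maximum-likelihood model."""
    comp = eps          # running normalizer: sum of maximized likelihoods so far
    s, s2, n = 0.0, 0.0, 0
    out = []
    for x in activations:
        # update running sufficient statistics for the Gaussian ML estimate
        n += 1
        s += x
        s2 += x * x
        mean = s / n
        var = max(s2 / n - mean * mean, eps)   # ML variance, floored for stability
        # maximized likelihood of the current activation under the fitted model
        p = np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)
        comp += p
        nml = p / comp                         # incremental NML probability
        out.append(x * -np.log(nml + eps))     # weight by code length (regularity)
    return np.array(out)
```

In a network, such a factor could be applied per layer alongside the usual batch or layer statistics; here the sequential loop only illustrates the incremental nature of the universal-code computation.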