Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

by Zeke Xie, et al.

Deep learning is often criticized for two serious issues that rarely arise in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labelled data, in which there is little knowledge behind the instance-label pairs. And when a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems, and it motivates us to design a similar mechanism, named artificial neural variability (ANV), which helps artificial neural networks learn some advantages from "natural" neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees ANV strictly improved generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. Empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
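The abstract does not spell out the NVRM algorithm, but its core idea, letting the weights fluctuate around their nominal values during training, can be sketched on a toy problem. The sketch below trains a NumPy linear-regression model while evaluating gradients at a noise-perturbed copy of the weights; the noise scale `sigma`, the learning rate, and the perturb-then-update scheme are illustrative assumptions, not the paper's exact neural variable optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + 1 plus observation noise.
X = rng.normal(size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + 0.1 * rng.normal(size=200)

w, b = 0.0, 0.0        # nominal (noise-free) weights
lr = 0.1               # learning rate (assumed)
sigma = 0.01           # weight-noise scale (assumed)

for step in range(500):
    # Artificial neural variability: compute the loss and its gradients
    # at a randomly perturbed copy of the weights, mimicking trial-to-trial
    # variability of a biological response to the same stimulus.
    w_p = w + sigma * rng.normal()
    b_p = b + sigma * rng.normal()

    pred = w_p * X[:, 0] + b_p
    err = pred - y
    grad_w = 2.0 * np.mean(err * X[:, 0])
    grad_b = 2.0 * np.mean(err)

    # Apply the perturbed gradients to the nominal weights, so the
    # variability regularizes training without accumulating in the model.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should land near the generating parameters w = 2, b = 1
```

Despite the per-step noise, the nominal weights converge close to the ground-truth parameters; the fluctuation acts as a mild regularizer rather than a bias.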


