-
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
The early phase of training has been shown to be important in two ways f...
-
Advantages of biologically-inspired adaptive neural activation in RNNs during learning
Dynamic adaptation in single-neuron response plays a fundamental role in...
-
Untangling tradeoffs between recurrence and self-attention in neural networks
Attention and self-attention mechanisms, inspired by cognitive processes...
-
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
A recent strategy to circumvent the exploding and vanishing gradient pro...
-
h-detach: Modifying the LSTM Gradient Towards Better Optimization
Recurrent neural networks are known for their notorious exploding and va...

Giancarlo Kerg