
Task Agnostic Continual Learning Using Online Variational Bayes with Fixed-Point Updates
Background: Catastrophic forgetting is the notorious vulnerability of ne...

Neural gradients are lognormally distributed: understanding sparse and quantized training
Neural gradient compression remains a main bottleneck in improving train...

The Knowledge Within: Methods for Data-Free Model Compression
Background: Recently, an extensive amount of research has been focused o...

At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Background: Recent developments have made it possible to accelerate neur...

Augment your batch: better training with larger batches
Large-batch SGD is important for scaling training of deep neural network...

ACIQ: Analytical Clipping for Integer Quantization of neural networks
Unlike traditional approaches that focus on the quantization at the netw...

Scalable Methods for 8-bit Training of Neural Networks
Quantized Neural Networks (QNNs) are often used to improve network effic...

Bayesian Gradient Descent: Online Variational Bayes Learning with Increased Robustness to Catastrophic Forgetting and Weight Pruning
We suggest a novel approach for the estimation of the posterior distribu...

Norm matters: efficient and accurate normalization schemes in deep networks
Over the past few years batch-normalization has been commonly used in de...

On the Blindspots of Convolutional Networks
Deep convolutional networks have been the state-of-the-art approach for a ...

Fix your classifier: the marginal value of training the last weight layer
Neural networks are commonly used as models for classification for a wid...

The Implicit Bias of Gradient Descent on Separable Data
We show that gradient descent on an unregularized logistic regression pr...

Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Background: Deep learning models are typically trained using stochastic ...

Exponentially vanishing suboptimal local minima in multilayer neural networks
Background: Statistical mechanics results (Dauphin et al. (2014); Chorom...

Deep unsupervised learning through spatial contrasting
Convolutional networks have marked their place over the last few years a...

Deep metric learning using Triplet network
Deep learning has proven itself as a successful set of models for learni...
Elad Hoffer