
Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling
Background: Floods are the most common natural disaster in the world, af...

Statistical Testing for Efficient Out of Distribution Detection in Deep Neural Networks
Commonly, Deep Neural Networks (DNNs) generalize well on samples drawn f...

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Recent work has highlighted the role of initialization scale in determin...

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Recently, researchers proposed pruning deep neural network weights (DNNs...

Task Agnostic Continual Learning Using Online Variational Bayes with Fixed-Point Updates
Background: Catastrophic forgetting is the notorious vulnerability of ne...

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
We provide a detailed asymptotic study of gradient flow trajectories and...

Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Deep neural networks are typically initialized with random weights, with...

Neural gradients are lognormally distributed: understanding sparse and quantized training
Neural gradient compression remains a main bottleneck in improving train...

Kernel and Rich Regimes in Overparametrized Models
A recent line of work studies overparametrized neural networks in the "k...

MTJ-Based Hardware Synapse Design for Quantized Deep Neural Networks
Quantized neural networks (QNNs) are being actively researched as a solu...

Is Feature Diversity Necessary in Neural Network Initialization?
Standard practice in training neural networks involves initializing the ...

The Knowledge Within: Methods for Data-Free Model Compression
Background: Recently, an extensive amount of research has been focused o...

A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case
A key element of understanding the efficacy of overparameterized neural ...

At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Background: Recent developments have made it possible to accelerate neur...

Kernel and Deep Regimes in Overparametrized Models
A recent line of work studies overparametrized neural networks in the "k...

A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off
Reducing the precision of weights and activation functions in neural net...

Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models
With an eye toward understanding complexity control in deep learning, we...

How do infinite width bounded norm networks look in function space?
We consider the question of what functions can be captured by ReLU netwo...

Augment your batch: better training with larger batches
Large-batch SGD is important for scaling training of deep neural network...
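The title's core idea, batch augmentation (building a larger batch by repeating each sample several times with independent random transforms), can be sketched minimally as follows; the additive-noise "augmentation", the function name, and the tensor shapes are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def augment_batch(batch, num_copies=4, rng=None):
    """Batch-augmentation sketch: repeat each sample `num_copies` times,
    giving each copy an independent random transform. Additive noise
    stands in here for a real data augmentation (crop, flip, etc.)."""
    rng = np.random.default_rng() if rng is None else rng
    repeated = np.repeat(batch, num_copies, axis=0)    # (B * num_copies, ...)
    noise = 0.1 * rng.standard_normal(repeated.shape)  # per-copy transform
    return repeated + noise

batch = np.zeros((8, 3, 32, 32))  # B=8 images (illustrative shape)
big_batch = augment_batch(batch, num_copies=4)
print(big_batch.shape)  # (32, 3, 32, 32)
```

Gradients computed over `big_batch` then average several augmented views of each sample within one step.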

ACIQ: Analytical Clipping for Integer Quantization of neural networks
Unlike traditional approaches that focus on the quantization at the netw...

Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Stochastic Gradient Descent (SGD) is a central tool in machine learning....

Implicit Bias of Gradient Descent on Linear Convolutional Networks
We show that gradient descent on full-width linear convolutional network...

Scalable Methods for 8bit Training of Neural Networks
Quantized Neural Networks (QNNs) are often used to improve network effic...

The Global Optimization Geometry of Shallow Linear Neural Networks
We examine the squared error loss landscape of shallow linear neural net...

Bayesian Gradient Descent: Online Variational Bayes Learning with Increased Robustness to Catastrophic Forgetting and Weight Pruning
We suggest a novel approach for the estimation of the posterior distribu...

Convergence of Gradient Descent on Separable Data
The implicit bias of gradient descent is not fully understood even in si...

Norm matters: efficient and accurate normalization schemes in deep networks
Over the past few years batch-normalization has been commonly used in de...

Characterizing Implicit Bias in Terms of Optimization Geometry
We study the bias of generic optimization methods, including Mirror Desc...

On the Blindspots of Convolutional Networks
Deep convolutional networks have been the state-of-the-art approach for a ...

Fix your classifier: the marginal value of training the last weight layer
Neural networks are commonly used as models for classification for a wid...

The Implicit Bias of Gradient Descent on Separable Data
We show that gradient descent on an unregularized logistic regression pr...
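The paper's headline result, that gradient descent on unregularized logistic regression over linearly separable data converges in direction to the L2 max-margin separator, can be illustrated with a small NumPy sketch; the toy dataset, step size, and iteration count below are illustrative assumptions, not from the paper:

```python
import numpy as np

# Toy linearly separable data through the origin (illustrative).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = np.zeros(2)
lr = 0.1
for _ in range(50_000):
    margins = y * (X @ w)
    # Gradient of the average logistic loss: -1/n * X^T (y * sigmoid(-margins)).
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(X)
    w -= lr * grad

direction = w / np.linalg.norm(w)
print(direction)  # slowly aligns with the max-margin direction as t grows
```

Note that `w` itself never converges; its norm diverges (logarithmically) while only its direction stabilizes, which is why the normalization in the last step matters.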

Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Background: Deep learning models are typically trained using stochastic ...

Exponentially vanishing suboptimal local minima in multilayer neural networks
Background: Statistical mechanics results (Dauphin et al. (2014); Chorom...

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
We introduce a method to train Quantized Neural Networks (QNNs) -- neur...

No bad local minima: Data independent training error guarantees for multilayer neural networks
We use smoothed analysis techniques to provide guarantees on the trainin...

Binarized Neural Networks
We introduce a method to train Binarized Neural Networks (BNNs) -- neural...
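Since the entry is only a snippet, here is a minimal sketch of the core BNN training recipe: deterministic sign binarization of weights in the forward pass, and a straight-through estimator that passes gradients only where the real-valued weight has not saturated. Function names are illustrative, not from the paper:

```python
import numpy as np

def binarize(w):
    """Forward pass: deterministic binarization, w_b = sign(w),
    with sign(0) mapped to +1."""
    return np.where(w >= 0, 1.0, -1.0)

def ste_grad(upstream, w, clip=1.0):
    """Backward pass: straight-through estimator. The non-differentiable
    sign() is treated as identity, but gradients are zeroed where
    |w| > clip so saturated weights stop updating."""
    return upstream * (np.abs(w) <= clip)

w = np.array([-1.7, -0.3, 0.2, 2.5])
print(binarize(w))             # [-1. -1.  1.  1.]
print(ste_grad(np.ones(4), w)) # [0. 1. 1. 0.]
```

Updates are applied to the real-valued weights `w`; only the binarized copy is used in the forward computation.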

Training Binary Multilayer Neural Networks for Image Classification using Expectation Backpropagation
Compared to Multilayer Neural Networks with real weights, Binary Multila...

Mean Field Bayes Backpropagation: scalable training of multilayer neural networks with binary weights
Significant success has been reported recently using deep neural network...
Daniel Soudry
Assistant Professor at Technion - Israel Institute of Technology