
Benchmarking Semi-supervised Federated Learning
Federated learning promises to use the computational power of edge devic...

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
We introduce AdaHessian, a second order stochastic optimization algorith...

Rethinking Batch Normalization in Transformers
The standard normalization method for neural network (NN) models used in...

ZeroQ: A Novel Zero Shot Quantization Framework
Quantization is a promising approach for reducing the inference time and...

PyHessian: Neural Networks Through the Lens of the Hessian
We present PyHessian, a new scalable framework that enables fast computa...

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Quantization is an effective method for reducing memory footprint and in...

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Transformer based architectures have become de-facto models used for a r...

ANODEV2: A Coupled Neural ODE Evolution Framework
It has been observed that residual networks can be viewed as the explici...

Residual Networks as Nonlinear Systems: Stability Analysis using Linearization
We regard pretrained residual networks (ResNets) as nonlinear systems a...

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks
It has been demonstrated that very simple attacks can fool highly-sophis...

Inefficiency of K-FAC for Large Batch Size Training
In stochastic optimization, large batch training can leverage parallel r...

Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data
In many applications, it is important to reconstruct a fluid flow field,...

Trust Region Based Adversarial Attack on Neural Networks
Deep Neural Networks are quite vulnerable to adversarial perturbations. ...

Parameter Re-Initialization through Cyclical Batch Size Schedules
Optimal parameter initialization remains a crucial problem for neural ne...

On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent
Increasing the minibatch size for stochastic gradient descent offers si...

Large batch size training of neural networks with adversarial training and second-order information
Stochastic Gradient Descent (SGD) methods using randomly selected batche...

Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Large batch size training of Neural Networks has been shown to incur acc...
Zhewei Yao