
HAWQ-V3: Dyadic Neural Network Quantization
Quantization is one of the key techniques used to make Neural Networks (...
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
Fully quantized training (FQT), which uses low-bitwidth hardware by quan...
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Phrase localization is a task that studies the mapping from textual phra...
Benchmarking Semi-supervised Federated Learning
Federated learning promises to use the computational power of edge devic...
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
We introduce AdaHessian, a second order stochastic optimization algorith...
Rethinking Batch Normalization in Transformers
The standard normalization method for neural network (NN) models used in...
ZeroQ: A Novel Zero Shot Quantization Framework
Quantization is a promising approach for reducing the inference time and...
PyHessian: Neural Networks Through the Lens of the Hessian
We present PyHessian, a new scalable framework that enables fast computa...
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Quantization is an effective method for reducing memory footprint and in...
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Transformer based architectures have become de-facto models used for a r...
ANODEV2: A Coupled Neural ODE Evolution Framework
It has been observed that residual networks can be viewed as the explici...
Residual Networks as Nonlinear Systems: Stability Analysis using Linearization
We regard pre-trained residual networks (ResNets) as nonlinear systems a...
JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks
It has been demonstrated that very simple attacks can fool highly-sophis...
Inefficiency of K-FAC for Large Batch Size Training
In stochastic optimization, large batch training can leverage parallel r...
Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data
In many applications, it is important to reconstruct a fluid flow field,...
Trust Region Based Adversarial Attack on Neural Networks
Deep Neural Networks are quite vulnerable to adversarial perturbations. ...
Parameter Re-Initialization through Cyclical Batch Size Schedules
Optimal parameter initialization remains a crucial problem for neural ne...
On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent
Increasing the mini-batch size for stochastic gradient descent offers si...
Large batch size training of neural networks with adversarial training and second-order information
Stochastic Gradient Descent (SGD) methods using randomly selected batche...
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Large batch size training of Neural Networks has been shown to incur acc...
Zhewei Yao
