
Learned Token Pruning for Transformers
A major challenge in deploying transformer models is their prohibitive i...

Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition
End-to-end neural network models achieve improved performance on various...

A Survey of Quantization Methods for Efficient Neural Network Inference
As soon as abstract mathematical computations were adapted to computatio...

Hessian-Aware Pruning and Optimal Neural Implant
Pruning is an effective method to reduce the memory footprint and FLOPs ...

I-BERT: Integer-only BERT Quantization
Transformer based models, like BERT and RoBERTa, have achieved state-of...

HAWQ-V3: Dyadic Neural Network Quantization
Quantization is one of the key techniques used to make Neural Networks (...

Boundary thickness and robustness in learning models
Robustness of machine learning models to various adversarial and non-adv...

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
We introduce AdaHessian, a second order stochastic optimization algorith...

Rethinking Batch Normalization in Transformers
The standard normalization method for neural network (NN) models used in...

ZeroQ: A Novel Zero Shot Quantization Framework
Quantization is a promising approach for reducing the inference time and...

PyHessian: Neural Networks Through the Lens of the Hessian
We present PyHessian, a new scalable framework that enables fast computa...

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Quantization is an effective method for reducing memory footprint and in...

Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization
Modern neural networks are increasingly bottlenecked by the limited capa...

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Transformer based architectures have become de-facto models used for a r...

ANODEV2: A Coupled Neural ODE Evolution Framework
It has been observed that residual networks can be viewed as the explici...

Inefficiency of K-FAC for Large Batch Size Training
In stochastic optimization, large batch training can leverage parallel r...

ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs
Residual neural networks can be viewed as the forward Euler discretizati...

Trust Region Based Adversarial Attack on Neural Networks
Deep Neural Networks are quite vulnerable to adversarial perturbations. ...

Parameter Re-Initialization through Cyclical Batch Size Schedules
Optimal parameter initialization remains a crucial problem for neural ne...

On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent
Increasing the mini-batch size for stochastic gradient descent offers si...

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge
Gliomas are the most common primary brain malignancies, with different d...

A Novel Domain Adaptation Framework for Medical Image Segmentation
We propose a segmentation framework that uses deep neural networks and i...

Large batch size training of neural networks with adversarial training and second-order information
Stochastic Gradient Descent (SGD) methods using randomly selected batche...

CLAIRE: A distributed-memory solver for constrained large deformation diffeomorphic image registration
We introduce CLAIRE, a distributed-memory algorithm and software for sol...

Co-Design of Deep Neural Nets and Neural Net Accelerators for Embedded Vision Applications
Deep Learning is arguably the most rapidly evolving research area in rec...

SqueezeNext: Hardware-Aware Neural Network Design
One of the main barriers for deploying neural networks on embedded syste...

PDE-constrained optimization in medical image analysis
PDE-constrained optimization problems find many applications in medical ...

Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Large batch size training of Neural Networks has been shown to incur acc...

Integrated Model, Batch and Domain Parallelism in Training Neural Networks
We propose a new integrated method of exploiting model, batch and domain...

Integrated Model and Data Parallelism in Training Neural Networks
We propose a new integrated method of exploiting both model and data par...

Distributed-memory large deformation diffeomorphic 3D image registration
We present a parallel distributed-memory algorithm for large deformation...
Amir Gholami