
-
Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces
This work presents DONNA (Distilling Optimal Neural Network Architecture...
-
Differentiable Joint Pruning and Quantization for Hardware Efficiency
We present a differentiable joint pruning and quantization (DJPQ) scheme...
-
Bayesian Bits: Unifying Quantization and Pruning
We introduce Bayesian Bits, a practical method for joint mixed precision...
-
Up or Down? Adaptive Rounding for Post-Training Quantization
When quantizing neural networks, assigning each floating-point weight to...
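To make the "up or down" question concrete, here is a minimal PyTorch-style sketch of an adaptive rounding step: instead of always rounding each weight to the nearest grid point, a continuous per-weight variable learns whether to round down or up by minimizing a layer-wise reconstruction loss. The function names, the rectified-sigmoid constants, and the regularizer weights are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def rectified_sigmoid(v, zeta=1.1, gamma=-0.1):
    # Relax the binary up/down decision into a continuous value in [0, 1].
    return torch.clamp(torch.sigmoid(v) * (zeta - gamma) + gamma, 0.0, 1.0)

def adaround_quantize(w, v, scale, n_bits=8):
    # Round down, then add a learned 0/1 offset instead of rounding to nearest.
    w_floor = torch.floor(w / scale)
    w_int = w_floor + rectified_sigmoid(v)          # soft "up or down" choice
    w_int = torch.clamp(w_int, 0, 2 ** n_bits - 1)  # assuming an unsigned grid
    return w_int * scale

def reconstruction_loss(x, w, v, scale, lam=0.01, beta=2.0):
    # Layer-wise objective: match the full-precision layer output and push the
    # relaxed rounding variables toward hard 0/1 decisions.
    y_fp = x @ w.t()
    y_q = x @ adaround_quantize(w, v, scale).t()
    round_reg = (1 - (2 * rectified_sigmoid(v) - 1).abs().pow(beta)).sum()
    return F.mse_loss(y_q, y_fp) + lam * round_reg
```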
-
LSQ+: Improving low-bit quantization through learnable offsets and better initialization
Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that...
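As a rough illustration of the learnable-offset idea, the sketch below implements an asymmetric fake-quantizer whose scale and offset are trained through a straight-through estimator, so that activations with negative outputs (Swish, H-swish, Mish) remain representable. The class and parameter names are assumptions; LSQ+'s gradient scaling and initialization scheme are omitted.

```python
import torch
import torch.nn as nn

def round_ste(x):
    # Straight-through estimator: round in the forward pass,
    # identity gradient in the backward pass.
    return x + (torch.round(x) - x).detach()

class LearnableAsymmetricQuant(nn.Module):
    # Fake-quantizer with a learnable scale and offset (zero point).
    def __init__(self, n_bits=4, init_scale=0.1, init_offset=0.0):
        super().__init__()
        self.qmin, self.qmax = 0, 2 ** n_bits - 1
        self.scale = nn.Parameter(torch.tensor(init_scale))
        self.offset = nn.Parameter(torch.tensor(init_offset))

    def forward(self, x):
        # The learnable offset shifts the grid so that negative activation
        # values stay representable on a low-bit unsigned grid.
        x_int = torch.clamp(round_ste((x - self.offset) / self.scale),
                            self.qmin, self.qmax)
        return x_int * self.scale + self.offset
```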
-
Conditional Channel Gated Networks for Task-Aware Continual Learning
Convolutional Neural Networks experience catastrophic forgetting when op...
-
Learned Threshold Pruning
This paper presents a novel differentiable method for unstructured weigh...
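A minimal sketch of the underlying idea, assuming a soft, sigmoid-based gate with a learnable per-layer threshold so that the threshold itself can be trained by backpropagation; the temperature and the form of the sparsity penalty (added to the task loss with some coefficient) are illustrative choices rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class SoftThresholdPruning(nn.Module):
    # Unstructured pruning: weights whose squared magnitude falls below a
    # learnable per-layer threshold are softly gated toward zero.
    def __init__(self, init_threshold=1e-3, temperature=1e-4):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(init_threshold))
        self.temperature = temperature

    def soft_mask(self, w):
        return torch.sigmoid((w ** 2 - self.threshold) / self.temperature)

    def forward(self, w):
        return w * self.soft_mask(w)

    def sparsity_penalty(self, w):
        # Differentiable surrogate for the number of surviving weights.
        return self.soft_mask(w).sum()
```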
-
Gradient ℓ1 Regularization for Quantization Robustness
We analyze the effect of quantizing weights and activations of neural ne...
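The following sketch shows one way such a regularizer can be implemented in PyTorch: the ℓ1 norm of the gradient of the task loss with respect to the weights is added to the loss via double backpropagation, which (to first order) makes the loss less sensitive to small quantization perturbations of the weights. The helper name and the regularization strength are assumptions for illustration.

```python
import torch

def gradient_l1_regularized_loss(model, loss_fn, x, y, lam=1e-4):
    # Task loss plus the l1 norm of its gradient w.r.t. the weights.
    # create_graph=True enables double backprop so the regularizer itself
    # is differentiable and can be minimized by the optimizer.
    loss = loss_fn(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    grad_l1 = sum(g.abs().sum() for g in grads)
    return loss + lam * grad_l1
```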
-
Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks
The success of deep neural networks in many real-world applications is l...
-
Batch-Shaped Channel Gated Networks
We present a method for gating deep-learning architectures on a fine-gra...
-
Data-Free Quantization through Weight Equalization and Bias Correction
We introduce a data-free quantization method for deep neural networks th...
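A minimal sketch of the cross-layer weight (range) equalization step for two consecutive fully connected layers with a ReLU in between, exploiting the fact that ReLU(x / s) * s = ReLU(x) for s > 0; bias correction and the batch-normalization handling described in the paper are omitted, and the helper name is hypothetical.

```python
import torch

@torch.no_grad()
def cross_layer_equalization(w1, b1, w2):
    # w1: [out1, in1] weights of the first layer, b1 its bias,
    # w2: [out2, out1] weights of the second layer; ReLU assumed in between.
    r1 = w1.abs().amax(dim=1)                    # per-output-channel range, layer 1
    r2 = w2.abs().amax(dim=0).clamp(min=1e-8)    # per-input-channel range, layer 2
    s = torch.sqrt(r1 * r2) / r2                 # equalizes r1 / s and r2 * s
    w1_eq = w1 / s[:, None]
    b1_eq = b1 / s
    w2_eq = w2 * s[None, :]
    return w1_eq, b1_eq, w2_eq, s
```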
-
Relaxed Quantization for Discretized Neural Networks
Neural network quantization has become an important research area due to...
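As a loose illustration of relaxing the discrete rounding step, the sketch below softly assigns each value to points on a fixed quantization grid using a Gumbel-softmax; the squared-distance logits, the unsigned grid, and the temperature are illustrative assumptions, not the paper's exact logistic-noise formulation.

```python
import torch
import torch.nn.functional as F

def relaxed_quantize(x, scale, n_levels=16, temperature=0.5):
    # Soft, stochastic assignment of each value to a fixed grid of levels.
    # Logits favor nearby grid points; Gumbel-softmax makes the (otherwise
    # hard) rounding step differentiable during training.
    grid = torch.arange(n_levels, dtype=x.dtype, device=x.device) * scale
    logits = -((x.unsqueeze(-1) - grid) ** 2) / scale
    probs = F.gumbel_softmax(logits, tau=temperature, hard=False, dim=-1)
    return (probs * grid).sum(dim=-1)
```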