We explore the impact of parameter sparsity on the scaling behavior of T...
Obtaining versions of deep neural networks that are both highly-accurate...
We initiate the study of repeated game dynamics in the population model,...
We present ongoing work on a new automatic code generation approach for ...
We study a distributed multi-armed bandit setting among a population of ...
Leveraging second-order information at the scale of deep networks is one...
Recent advances in large language model (LLM) pretraining have led to hi...
Knowledge distillation is a popular approach for enhancing the performan...
Pruning - that is, setting a significant subset of the parameters of a n...
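As a point of reference for the operation described above, the following minimal Python sketch performs one-shot magnitude pruning, zeroing out the smallest-magnitude fraction of a weight array; the magnitude-based selection rule is an illustrative assumption, not necessarily the criterion studied in the paper.

import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    # Illustrative one-shot magnitude pruning: zero out the fraction
    # `sparsity` of entries with the smallest absolute value. The
    # selection rule is an assumption for illustration only.
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned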
Determining the degree of inherent parallelism in classical sequential a...
Recent vision architectures and self-supervised training methods enable ...
We provide a new efficient version of the backpropagation algorithm, spe...
The breakthrough performance of large language models (LLMs) comes with ...
Communication-reduction techniques are a popular way to improve scalabil...
We show for the first time that large-scale generative pretrained transf...
To maximize the performance of concurrent data structures, researchers h...
Asynchronous programming has gained significant popularity over the last...
Data-parallel distributed training of deep neural networks (DNN) has gai...
Generative Pre-trained Transformer (GPT) models set themselves apart thr...
Models from the Vision Transformer (ViT) family have recently provided b...
Distributed optimization has become one of the standard ways of speeding...
We revisit the performance of the classic gradual magnitude pruning (GMP...
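For readers unfamiliar with GMP, a commonly used variant ramps sparsity up over training along a cubic schedule while periodically re-pruning by magnitude; the sketch below computes that schedule and is a generic illustration, not necessarily the exact recipe revisited in the paper.

def gmp_sparsity(step: int, start: int, end: int, final_sparsity: float,
                 initial_sparsity: float = 0.0) -> float:
    # Cubic sparsity ramp often used with gradual magnitude pruning:
    # sparsity grows from `initial_sparsity` at `start` to `final_sparsity`
    # at `end`; pruning is applied periodically in between.
    if step <= start:
        return initial_sparsity
    if step >= end:
        return final_sparsity
    progress = (step - start) / (end - start)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3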
We consider the problem of model compression for deep neural networks (D...
We examine the question of whether SGD-based optimization of deep neural...
Federated Learning (FL) is an emerging paradigm to enable the large-scal...
In the stochastic population protocol model, we are given a connected gr...
Pre-trained Transformer-based language models have become a key building...
Powered by the simplicity of lock-free asynchrony, Hogwild! is a go-to ...
The recent focus on the efficiency of deep neural networks (DNNs) has le...
In this note, we introduce a distributed twist on the classic coupon col...
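The classical single-collector baseline behind this note is easy to simulate; the Python sketch below estimates the expected number of uniform draws needed to see all n coupon types (about n times the n-th harmonic number), and only recalls the standard problem rather than the distributed variant introduced in the note.

import random

def coupon_collector_draws(n: int) -> int:
    # Draw coupons uniformly at random until every one of the n types
    # has been seen; return the number of draws (expectation is n * H_n).
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        draws += 1
    return draws

if __name__ == "__main__":
    n, trials = 100, 1000
    avg = sum(coupon_collector_draws(n) for _ in range(trials)) / trials
    print(f"average draws for n={n}: {avg:.1f}")  # roughly n * H_n, about 518.7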
Transfer learning is a classic paradigm by which models pretrained on la...
Writing concurrent code that is both correct and efficient is notoriousl...
The ability to scale out training workloads has been one of the key perf...
We study efficient distributed algorithms for the fundamental problem of...
Designing and implementing efficient parallel priority schedulers is an ...
This paper gives tight logarithmic lower bounds on the solo step complex...
The availability of large amounts of user-provided data has been key to ...
Efficiently approximating local curvature information of the loss functi...
The increasing computational requirements of deep neural networks (DNNs)...
Dynamic Connectivity is a fundamental algorithmic graph problem, motivat...
As the size and complexity of models and datasets grow, so does the need...
Approximate agreement is one of the few variants of consensus that can b...
Let G be a graph on n nodes. In the stochastic population protocol model...
We investigate fast and communication-efficient algorithms for the class...
The growing energy and performance costs of deep learning have driven th...
We study adversary-resilient stochastic distributed optimization, in whi...
Many communication-efficient variants of SGD use gradient quantization s...
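As background for this abstract, a typical building block of such schemes is an unbiased stochastic quantizer that rounds each coordinate to a small number of levels; the sketch below shows one generic such quantizer, with the level count and scaling chosen purely for illustration rather than taken from the paper.

import numpy as np

def stochastic_quantize(g: np.ndarray, levels: int = 4, rng=None) -> np.ndarray:
    # Unbiased stochastic quantization of a gradient vector onto `levels`
    # uniform levels per coordinate, scaled by the max magnitude.
    rng = rng if rng is not None else np.random.default_rng()
    scale = float(np.max(np.abs(g)))
    if scale == 0.0:
        return np.zeros_like(g)
    x = np.abs(g) / scale * levels          # position within [0, levels]
    lower = np.floor(x)
    prob_up = x - lower                     # round up with this probability
    q = lower + (rng.random(g.shape) < prob_up)
    return np.sign(g) * q * scale / levels  # expectation equals the input g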
Motivated by the interest in communication-efficient methods for distrib...
The design and implementation of efficient concurrent data structures ha...
Transactions can simplify distributed applications by hiding data distri...