
On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
Despite the conventional wisdom that using batch normalization with weig...

Mean Embeddings with Test-Time Data Augmentation for Ensembling of Representations
Averaging predictions over a set of models – an ensemble – is widely use...

Towards Practical Credit Assignment for Deep Reinforcement Learning
Credit assignment is a fundamental problem in reinforcement learning, th...

On Power Laws in Deep Ensembles
Ensembles of deep neural networks are known to achieve state-of-the-art ...

Involutive MCMC: a Unifying Framework
Markov Chain Monte Carlo (MCMC) is a computational approach to fundament...

MARS: Masked Automatic Ranks Selection in Tensor Decompositions
Tensor decomposition methods have recently proven to be efficient for co...

Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks
Training neural networks with binary weights and activations is a challe...

Deep Ensembles on a Fixed Memory Budget: One Wide Network or Several Thinner Ones?
One of the generally accepted views of modern deep learning is that incr...

Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
The overestimation bias is one of the major impediments to accurate off...

Deterministic Decoding for Discrete Data in Variational Autoencoders
Variational autoencoders are prominent generative models for modeling di...

Stochasticity in Neural ODEs: An Empirical Study
Stochastic regularization of neural networks (e.g. dropout) is a widesp...

Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation
Test-time data augmentation, averaging the predictions of a machine learn...

Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning
Uncertainty estimation and ensembling methods go hand-in-hand. Uncertain...

MLRG Deep Curvature
We present MLRG Deep Curvature suite, a PyTorch-based, open-source packa...

Low-variance Black-box Gradient Estimates for the Plackett-Luce Distribution
Learning models with discrete latent variables using stochastic gradient...

Structured Sparsification of Gated Recurrent Neural Networks
Recently, a lot of techniques were developed to sparsify the weights of ...

A Prior of a Googol Gaussians: a Tensor Ring Induced Prior for Generative Models
Generative models produce realistic objects in many domains, including t...

Subspace Inference for Bayesian Deep Learning
Bayesian inference was once a gold standard for learning with neural net...

The Implicit Metropolis-Hastings Algorithm
Recent works propose using the discriminator of a GAN to filter out unre...

Importance Weighted Hierarchical Variational Inference
Variational Inference is a powerful tool in the Bayesian modeling toolki...

Semi-Conditional Normalizing Flows for Semi-Supervised Learning
This paper proposes a semi-conditional normalizing flow model for semi-s...

User-Controllable Multi-Texture Synthesis with Generative Adversarial Networks
We propose a novel multi-texture synthesis model based on generative adv...

A Simple Baseline for Bayesian Uncertainty in Deep Learning
We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose ...

Bayesian Sparsification of Gated Recurrent Neural Networks
Bayesian methods have been successfully applied to sparsify weights of n...

ReSet: Learning Recurrent Dynamic Routing in ResNet-like Neural Networks
Neural Network is a powerful Machine Learning tool that shows outstandin...

Variational Dropout via Empirical Bayes
We study the Automatic Relevance Determination procedure applied to deep...

Bayesian Compression for Natural Language Processing
In natural language processing, a lot of the tasks are successfully solv...

Metropolis-Hastings view on variational inference and adversarial training
In this paper we propose to view the acceptance rate of the Metropolis-H...

The Deep Weight Prior. Modeling a prior distribution for CNNs using generative models
Bayesian inference is known to provide a general framework for incorpora...

Pairwise Augmented GANs with Adversarial Reconstruction Loss
We propose a novel autoencoding model called Pairwise Augmented GANs. We...

Doubly Semi-Implicit Variational Inference
We extend the existing framework of semi-implicit variational inference ...

Conditional Generators of Words Definitions
We explore recently introduced definition modeling technique that provid...

Universal Conditional Machine
We propose a single neural probabilistic model based on variational auto...

Averaging Weights Leads to Wider Optima and Better Generalization
Deep neural networks are typically trained by optimizing a loss function...

Bayesian Incremental Learning for Deep Neural Networks
In industrial machine learning pipelines, data often arrive in parts. Pa...

Uncertainty Estimation via Stochastic Batch Normalization
In this work, we investigate Batch Normalization technique and propose i...

Probabilistic Adaptive Computation Time
We present a probabilistic model with discrete latent variables that con...

Bayesian Sparsification of Recurrent Neural Networks
Recurrent neural networks show state-of-the-art results in many text ana...

Structured Bayesian Pruning via Log-Normal Multiplicative Noise
Dropout-based regularization methods can be regarded as injecting random...

Variational Dropout Sparsifies Deep Neural Networks
We explore a recently proposed Variational Dropout technique that provid...

Spatially Adaptive Computation Time for Residual Networks
This paper proposes a deep learning architecture based on Residual Netwo...

GTApprox: surrogate modeling for industrial design
We describe GTApprox, a new tool for medium-scale surrogate modeling in...

Tensorizing Neural Networks
Deep neural networks currently demonstrate state-of-the-art performance ...

PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
We propose a novel approach to reduce the computational cost of evaluati...

Breaking Sticks and Ambiguities with Adaptive Skip-gram
Recently proposed Skip-gram model is a powerful method for learning high...

Submodular relaxation for inference in Markov random fields
In this paper we address the problem of finding the most probable state ...

Multi-utility Learning: Structured-output Learning with Multiple Annotation-specific Loss Functions
Structured-output learning is a challenging problem; particularly so bec...

Submodular Decomposition Framework for Inference in Associative Markov Networks with Global Constraints
In the paper we address the problem of finding the most probable state o...
Dmitry Vetrov
Research Professor, Head of the Centre: Faculty of Computer Science, Laboratory Head: Faculty of Computer Science at Higher School of Economics, Leading Researcher at Yandex