
Dangers of Bayesian Model Averaging under Covariate Shift
Approximate Bayesian inference for neural networks is considered a robus...

Does Knowledge Distillation Really Work?
Knowledge distillation is a popular technique for training a small stude...

What Are Bayesian Neural Network Posteriors Really Like?
The posterior over Bayesian neural network (BNN) parameters is extremely...

Learning Invariances in Neural Networks
Invariances to translations have imbued convolutional neural networks wi...

Why Normalizing Flows Fail to Detect Out-of-Distribution Data
Detecting out-of-distribution (OOD) data is crucial for robust machine l...

Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
The translation equivariance of convolutional layers enables convolution...

Bayesian Deep Learning and a Probabilistic Perspective of Generalization
The key distinguishing property of a Bayesian approach is marginalizatio...

Semi-Supervised Learning with Normalizing Flows
Normalizing flows transform a latent distribution through an invertible ...

Subspace Inference for Bayesian Deep Learning
Bayesian inference was once a gold standard for learning with neural net...

A Simple Baseline for Bayesian Uncertainty in Deep Learning
We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose ...

Improving Consistency-Based Semi-Supervised Learning with Weight Averaging
Recent advances in deep unsupervised learning have renewed interest in s...

Averaging Weights Leads to Wider Optima and Better Generalization
Deep neural networks are typically trained by optimizing a loss function...

Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
The loss functions of deep neural networks are complex and their geometr...

Tensor Train decomposition on TensorFlow (T3F)
Tensor Train decomposition is used across many branches of machine learn...

Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition
We propose a method (TT-GP) for approximate inference in Gaussian Proces...

Faster variational inducing input Gaussian process classification
Gaussian processes (GP) provide a prior over functions and allow finding...
Pavel Izmailov