
SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice for Scalable Gaussian Processes
State-of-the-art methods for scalable Gaussian processes use iterative a...

Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition
We introduce a new scalable variational Gaussian process approximation w...

Does Knowledge Distillation Really Work?
Knowledge distillation is a popular technique for training a small stude...

What Are Bayesian Neural Network Posteriors Really Like?
The posterior over Bayesian neural network (BNN) parameters is extremely...

A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Symmetries and equivariance are fundamental to the generalization of neu...

Kernel Interpolation for Scalable Online Gaussian Processes
Gaussian processes (GPs) provide a gold standard for performance in onli...

Fast Adaptation with Linearized Neural Networks
The inductive biases of trained neural networks are difficult to underst...

Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling
With a better understanding of the loss surfaces for multilayer networks...

Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints
Reasoning about the physical world requires models that are endowed with...

Learning Invariances in Neural Networks
Invariances to translations have imbued convolutional neural networks wi...

On the model-based stochastic value gradient for continuous reinforcement learning
Model-based reinforcement learning approaches add explicit domain knowle...

Why Normalizing Flows Fail to Detect Out-of-Distribution Data
Detecting out-of-distribution (OOD) data is crucial for robust machine l...

Improving GAN Training with Probability Ratio Clipping and Sample Reweighting
Despite success on a wide range of problems related to vision, generativ...

Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited
Neural networks appear to have mysterious generalization properties when...

Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
The translation equivariance of convolutional layers enables convolution...

Bayesian Deep Learning and a Probabilistic Perspective of Generalization
The key distinguishing property of a Bayesian approach is marginalizatio...

The Case for Bayesian Deep Learning
The key distinguishing property of a Bayesian approach is marginalizatio...

Semi-Supervised Learning with Normalizing Flows
Normalizing flows transform a latent distribution through an invertible ...

Randomly Projected Additive Gaussian Processes for Regression
Gaussian processes (GPs) provide flexible distributions over functions, ...

Function-Space Distributions over Kernels
Gaussian processes are flexible function approximators, with inductive b...

BoTorch: Programmable Bayesian Optimization in PyTorch
Bayesian optimization provides sample-efficient global optimization for ...

Subspace Inference for Bayesian Deep Learning
Bayesian inference was once a gold standard for learning with neural net...

Simple Black-box Adversarial Attacks
We propose an intriguingly simple method for the construction of adversa...

SWALP : Stochastic Weight Averaging in Low-Precision Training
Low precision operations can provide scalability, memory savings, portab...

SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...

Exact Gaussian Processes on a Million Data Points
Gaussian processes (GPs) are flexible models with state-of-the-art perfo...

Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning
Bayesian optimization is popular for optimizing time-consuming black-box...

Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
The posteriors over neural network weights are high dimensional and mult...

A Simple Baseline for Bayesian Uncertainty in Deep Learning
We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose ...

Scaling Gaussian Process Regression with Derivatives
Gaussian processes (GPs) with derivatives are useful in many application...

Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction
Identifying changes in model parameters is fundamental in machine learni...

GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration
Despite advances in scalable models, the inference tools used for Gaussi...

Improving Consistency-Based Semi-Supervised Learning with Weight Averaging
Recent advances in deep unsupervised learning have renewed interest in s...

Probabilistic FastText for Multi-Sense Word Embeddings
We introduce Probabilistic FastText, a new model for word embeddings tha...

Hierarchical Density Order Embeddings
By representing words with probability densities rather than point vecto...

Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data
Identifying anomalous patterns in real-world data is essential for under...

Constant-Time Predictive Distributions for Gaussian Processes
One of the most compelling features of Gaussian process (GP) regression ...

Averaging Weights Leads to Wider Optima and Better Generalization
Deep neural networks are typically trained by optimizing a loss function...

Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
The loss functions of deep neural networks are complex and their geometr...

Product Kernel Interpolation for Scalable Gaussian Processes
Recent work shows that inference for Gaussian processes can be performed...

Scalable Lévy Process Priors for Spectral Kernel Learning
Gaussian processes are rich distributions over functions, with generaliz...

Proceedings of NIPS 2017 Symposium on Interpretable Machine Learning
This is the Proceedings of NIPS 2017 Symposium on Interpretable Machine ...

Scalable Log Determinants for Gaussian Process Kernel Learning
For applications as varied as Bayesian neural networks, determinantal po...

Bayesian GAN
Generative adversarial networks (GANs) can implicitly learn rich distrib...

Multimodal Word Distributions
Word embeddings provide point representations of words containing useful...

Bayesian Optimization with Gradients
Bayesian optimization has been successful at global optimization of expe...

Stochastic Variational Deep Kernel Learning
Deep kernel learning combines the nonparametric flexibility of kernel m...

Learning Scalable Deep Kernels with Recurrent Structure
Many applications in speech, robotics, finance, and biology deal with se...

Deep Kernel Learning
We introduce scalable deep kernels, which combine the structural propert...

Thoughts on Massively Scalable Gaussian Processes
We introduce a framework and early results for massively scalable Gaussi...