
Why Normalizing Flows Fail to Detect Out-of-Distribution Data
Detecting out-of-distribution (OOD) data is crucial for robust machine l...

Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
The translation equivariance of convolutional layers enables convolution...

Bayesian Deep Learning and a Probabilistic Perspective of Generalization
The key distinguishing property of a Bayesian approach is marginalizatio...

Semi-Supervised Learning with Normalizing Flows
Normalizing flows transform a latent distribution through an invertible ...
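The entry above describes flows as transforming a latent distribution through an invertible map. A minimal sketch of the underlying change-of-variables computation for a one-dimensional affine flow, with illustrative names and values not taken from the paper:

```python
import numpy as np

# Toy 1-D affine flow: z -> x = a*z + b, invertible whenever a != 0.
# Change of variables: log p_x(x) = log p_z(f_inv(x)) + log |d f_inv / dx|
a, b = 2.0, 0.5

def forward(z):
    return a * z + b

def log_prob(x):
    z = (x - b) / a                              # inverse transform f_inv(x)
    log_pz = -0.5 * (z**2 + np.log(2 * np.pi))   # standard normal base density
    log_det = -np.log(abs(a))                    # log |dz/dx| for the affine map
    return log_pz + log_det

x = forward(0.0)
print(log_prob(x))  # density of x under the flow
```

Real flows stack many such invertible layers (with learned parameters) so that both sampling and exact density evaluation stay tractable.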

Subspace Inference for Bayesian Deep Learning
Bayesian inference was once a gold standard for learning with neural net...

A Simple Baseline for Bayesian Uncertainty in Deep Learning
We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose ...
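SWAG builds a Gaussian posterior approximation from SGD weight iterates. A minimal diagonal-covariance sketch of the idea (illustrative code, not the authors' implementation):

```python
import numpy as np

def fit_swag_diag(weight_snapshots):
    """Fit a diagonal Gaussian to SGD weight iterates (SWAG-Diag style)."""
    W = np.stack(weight_snapshots)        # (num_snapshots, num_params)
    mean = W.mean(axis=0)                 # the SWA solution
    var = (W**2).mean(axis=0) - mean**2   # diagonal variance from second moments
    return mean, var

def sample_weights(mean, var, rng):
    """Draw one weight sample for Bayesian model averaging at test time."""
    return mean + np.sqrt(np.maximum(var, 0.0)) * rng.standard_normal(mean.shape)

rng = np.random.default_rng(0)
snaps = [np.array([1.0, 2.0]), np.array([3.0, 2.0]), np.array([2.0, 2.0])]
mean, var = fit_swag_diag(snaps)
print(mean, var)
```

Predictions are then averaged over several sampled weight vectors, which is what gives the calibrated uncertainty the abstract refers to.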

Improving Consistency-Based Semi-Supervised Learning with Weight Averaging
Recent advances in deep unsupervised learning have renewed interest in s...

Averaging Weights Leads to Wider Optima and Better Generalization
Deep neural networks are typically trained by optimizing a loss function...
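The core operation in the paper above is averaging weights collected along the SGD trajectory (SWA). A minimal sketch of the running average, with illustrative variable names:

```python
import numpy as np

def swa_average(weight_snapshots):
    """Running mean of weight vectors collected along an SGD trajectory."""
    swa = np.zeros_like(weight_snapshots[0])
    for n, w in enumerate(weight_snapshots, start=1):
        swa += (w - swa) / n   # incremental mean: no need to store all snapshots
    return swa

snapshots = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
print(swa_average(snapshots))  # -> [3. 4.]
```

In practice the snapshots are taken at the end of each epoch under a constant or cyclical learning rate, and batch-norm statistics are recomputed for the averaged weights.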

Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
The loss functions of deep neural networks are complex and their geometr...

Tensor Train decomposition on TensorFlow (T3F)
Tensor Train decomposition is used across many branches of machine learn...

Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition
We propose a method (TT-GP) for approximate inference in Gaussian Proces...

Faster variational inducing input Gaussian process classification
Gaussian processes (GP) provide a prior over functions and allow finding...
Pavel Izmailov