Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs

02/27/2018
by Timur Garipov, et al.

The loss functions of deep neural networks are complex and their geometric properties are not well understood. We show that the optima of these complex loss functions are in fact connected by a simple polygonal chain with only one bend, over which training and test accuracy are nearly constant. We introduce a training procedure to discover these high-accuracy pathways between modes. Inspired by this new geometric insight, we propose a new ensembling method entitled Fast Geometric Ensembling (FGE). Using FGE we can train high-performing ensembles in the time required to train a single model. We achieve improved performance compared to the recent state-of-the-art Snapshot Ensembles, on CIFAR-10 and CIFAR-100, using state-of-the-art deep residual networks. On ImageNet we improve the top-1 error-rate of a pre-trained ResNet by 0.56
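To make the "polygonal chain with only one bend" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the chain runs through two straight segments, from one optimum `w1` to a bend point `theta` and from `theta` to a second optimum `w2`, parametrized by `t` in [0, 1]. The function names, the 2-D `toy_loss`, and the hand-picked bend are all hypothetical and chosen for illustration only; in the setting the abstract describes, the bend would be found by a training procedure rather than fixed by hand.

```python
def polygonal_chain(w1, w2, theta, t):
    """Return the point at position t in [0, 1] on the chain w1 -> theta -> w2."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if t <= 0.5:
        a = 2.0 * t  # fraction of the way from w1 to the bend
        return [(1.0 - a) * p + a * q for p, q in zip(w1, theta)]
    a = 2.0 * t - 1.0  # fraction of the way from the bend to w2
    return [(1.0 - a) * p + a * q for p, q in zip(theta, w2)]


def toy_loss(w):
    """Hypothetical 2-D loss whose minima (loss 0) form the unit circle."""
    return (w[0] ** 2 + w[1] ** 2 - 1.0) ** 2


def max_loss_on_chain(w1, w2, theta, loss, n=101):
    """Largest loss over n evenly spaced points of the chain (the 'barrier')."""
    return max(
        loss(polygonal_chain(w1, w2, theta, i / (n - 1))) for i in range(n)
    )


# Two "modes" of the toy loss on opposite sides of the circle.
w1, w2 = [1.0, 0.0], [-1.0, 0.0]

# Placing the bend at the midpoint reduces the chain to a straight segment,
# which crosses the high-loss centre of the circle.
straight = max_loss_on_chain(w1, w2, [0.0, 0.0], toy_loss)

# A hand-picked bend on the circle keeps the chain closer to the minima.
bent = max_loss_on_chain(w1, w2, [0.0, 1.0], toy_loss)

print(f"max loss, straight segment: {straight:.3f}")
print(f"max loss, one-bend chain:   {bent:.3f}")
```

In this toy setting the one-bend chain has a lower worst-case loss than the straight segment. This illustrates why a path with a bend can stay in a low-loss region even when direct linear interpolation between two optima does not.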


research
03/14/2018

Averaging Weights Leads to Wider Optima and Better Generalization

Deep neural networks are typically trained by optimizing a loss function...
research
02/25/2021

Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling

With a better understanding of the loss surfaces for multilayer networks...
research
02/14/2022

PFGE: Parsimonious Fast Geometric Ensembling of DNNs

Ensemble methods have been widely used to improve the performance of mac...
research
06/24/2022

Out of distribution robustness with pre-trained Bayesian neural networks

We develop ShiftMatch, a new training-data-dependent likelihood for out ...
research
05/30/2019

Leveraging Simple Model Predictions for Enhancing its Performance

There has been recent interest in improving performance of simple models...
research
02/12/2019

Joint Training of Neural Network Ensembles

We examine the practice of joint training for neural network ensembles, ...
research
01/22/2017

Optimization on Product Submanifolds of Convolution Kernels

Recent advances in optimization methods used for training convolutional ...
