PEP: Parameter Ensembling by Perturbation

10/24/2020
by Alireza Mehrtash, et al.

Ensembling is now recognized as an effective approach for increasing the predictive performance and calibration of deep networks. We introduce a new approach, Parameter Ensembling by Perturbation (PEP), that constructs an ensemble of parameter values as random perturbations of the optimal parameter set from training, drawn from a Gaussian with a single variance parameter. The variance is chosen to maximize the log-likelihood of the ensemble average (𝕃) on the validation data set. Empirically, and perhaps surprisingly, 𝕃 has a well-defined maximum as the variance grows from zero (which corresponds to the baseline model). Conveniently, the calibration of predictions also tends to improve until the peak of 𝕃 is reached. In most experiments, PEP provides a small improvement in performance, and, in some cases, a substantial improvement in empirical calibration. We show that this "PEP effect" (the gain in log-likelihood) is related to the mean curvature of the likelihood function and the empirical Fisher information. Experiments on ImageNet pre-trained networks, including ResNet, DenseNet, and Inception, showed improved calibration and likelihood; we further observed a mild improvement in classification accuracy on these networks. Experiments on classification benchmarks such as MNIST and CIFAR-10 showed improved calibration and likelihood, as well as a relationship between the PEP effect and overfitting; this demonstrates that PEP can be used to probe the level of overfitting that occurred during training. In general, no special training procedure or network architecture is needed, and in the case of pre-trained networks, no additional training is needed.
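To make the procedure concrete, below is a minimal sketch of PEP as the abstract describes it, applied to a toy linear classifier. Everything here is an illustrative assumption rather than the authors' implementation: the function names, the ensemble size of 10, and the use of a simple grid search to maximize the validation log-likelihood 𝕃 over the single variance parameter.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def predict(theta, X):
    # Stand-in model: a single linear layer. PEP itself is model-agnostic;
    # any network whose parameters can be perturbed would work the same way.
    return softmax(X @ theta)


def ensemble_log_likelihood(theta_star, sigma, X_val, y_val, n_members=10):
    # Average the predictive distributions of Gaussian perturbations of the
    # trained parameters theta_star, then score the mean log-likelihood (L)
    # of the ensemble-averaged prediction on the validation set.
    probs = np.mean(
        [predict(theta_star + sigma * rng.standard_normal(theta_star.shape), X_val)
         for _ in range(n_members)],
        axis=0,
    )
    return np.mean(np.log(probs[np.arange(len(y_val)), y_val] + 1e-12))


def pep_select_sigma(theta_star, X_val, y_val, sigma_grid):
    # Grid-search the single variance parameter; sigma = 0 recovers the
    # baseline model, so the search starts from the unperturbed network.
    scores = [ensemble_log_likelihood(theta_star, s, X_val, y_val)
              for s in sigma_grid]
    best = int(np.argmax(scores))
    return sigma_grid[best], scores[best]


# Toy validation data and "trained" parameters, purely for illustration.
X_val = rng.standard_normal((200, 5))
theta_star = rng.standard_normal((5, 3))
y_val = predict(theta_star, X_val).argmax(axis=1)

best_sigma, best_ll = pep_select_sigma(
    theta_star, X_val, y_val, sigma_grid=np.linspace(0.0, 0.5, 11))
print(f"best sigma: {best_sigma:.2f}, validation log-likelihood: {best_ll:.4f}")
```

Note that no retraining occurs anywhere in this loop: only forward passes with perturbed copies of the already-trained parameters, which is what makes the method applicable to off-the-shelf pre-trained networks.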

Related research

- Adaptive Calibrator Ensemble for Model Calibration under Distribution Shift (03/09/2023): Model calibration usually requires optimizing some parameters (e.g., tem...
- Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling (01/12/2020): The fate of scientific hypotheses often relies on the ability of a compu...
- Bag of Tricks for In-Distribution Calibration of Pretrained Transformers (02/13/2023): While pre-trained language models (PLMs) have become a de-facto standard...
- On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks (05/27/2019): Mixup (Zhang et al., 2017) is a recently proposed method for training deep neu...
- Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization (02/05/2021): Spiking Neural Networks (SNNs) offer a novel computational paradigm that...
- Training independent subnetworks for robust prediction (10/13/2020): Recent approaches to efficiently ensemble neural networks have shown tha...
- Self-Calibrating the Look-Elsewhere Effect: Fast Evaluation of the Statistical Significance Using Peak Heights (08/13/2021): In experiments where one searches a large parameter space for an anomaly...
