Regularising Deep Networks with DGMs
Here we develop a new method for regularising neural networks in which we learn a density estimator over the activations of all layers of the model. We extend recent work on data imputation with VAEs (Ivanov et al., 2018) so that we can obtain a posterior over an arbitrary subset of activations conditioned on the remainder. Our method has links both to dropout and to data augmentation. We demonstrate that our training method leads to lower cross-entropy test-set loss for 2-hidden-layer neural networks trained on CIFAR-10 and SVHN compared to standard regularisation baselines, but it does not improve test-set accuracy over those baselines. This implies that although the networks' decisions are broadly similar, our approach yields better-calibrated uncertainty over the class posteriors.
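To make the idea concrete, below is a minimal PyTorch sketch of a VAEAC-style conditional density model over a vector of hidden activations, in the spirit of Ivanov et al. (2018) applied to activations rather than input data. It is not the paper's implementation: the class name `ActivationVAEAC`, the architecture sizes, and the Gaussian reconstruction term are all illustrative assumptions; only the overall structure (encoder over full activations, conditional prior over the observed subset, decoder that imputes the masked subset) follows the described approach.

```python
import torch
import torch.nn as nn


class ActivationVAEAC(nn.Module):
    """Illustrative VAEAC-style model: given a binary mask b over a layer's
    activations a, learn p(a_masked | a_observed) through a latent code z.
    Names and dimensions are assumptions, not the authors' implementation."""

    def __init__(self, act_dim: int, latent_dim: int = 32, hidden: int = 128):
        super().__init__()
        # Proposal network sees the full activation vector plus the mask (training only).
        self.encoder = nn.Sequential(
            nn.Linear(2 * act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),
        )
        # Prior network conditions only on the observed (unmasked) activations.
        self.prior = nn.Sequential(
            nn.Linear(2 * act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),
        )
        # Decoder imputes the masked activations from z and the observed part.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 2 * act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, acts: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        """acts: (B, act_dim) hidden activations (detached from the classifier).
        mask: (B, act_dim) binary, 1 = activation hidden from the model."""
        observed = acts * (1 - mask)
        mu_q, logvar_q = self.encoder(torch.cat([acts, mask], -1)).chunk(2, -1)
        mu_p, logvar_p = self.prior(torch.cat([observed, mask], -1)).chunk(2, -1)

        # Reparameterised sample from the approximate posterior.
        z = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()
        recon = self.decoder(torch.cat([z, observed, mask], -1))

        # Gaussian reconstruction term, scored on the masked activations only.
        recon_loss = (mask * (recon - acts) ** 2).sum(-1).mean()
        # Analytic KL between the posterior and the conditional prior.
        kl = 0.5 * (
            logvar_p - logvar_q
            + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
            - 1.0
        ).sum(-1).mean()
        return recon_loss + kl
```

In a regularisation setting of this kind, one plausible use is to add this loss (weighted by a hyperparameter) to the classifier's cross-entropy during training, with a fresh random mask drawn per batch, so that every subset of activations remains predictable from the rest; the connection to dropout comes from the random masking, and the connection to data augmentation from sampling imputed activations.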