
Galaxy Zoo: Probabilistic Morphology through Bayesian CNNs and Active Learning
We use Bayesian convolutional neural networks and a novel generative model of Galaxy Zoo volunteer responses to infer posteriors for the visual morphology of galaxies. Bayesian CNNs can learn from galaxy images with uncertain labels and then, for previously unlabelled galaxies, predict the probability of each possible label. Our posteriors are well-calibrated (e.g. for predicting bars, we achieve coverage errors of 10.6 responses) and hence are reliable for practical use. Further, using our posteriors, we apply the active learning strategy BALD to request volunteer responses for the subset of galaxies which, if labelled, would be most informative for training our network. We show that training our Bayesian CNNs using active learning requires up to 35-60% fewer labelled galaxies, depending on the morphological feature being classified. By combining human and machine intelligence, Galaxy Zoo will be able to classify surveys of any conceivable scale on a timescale of weeks, providing massive and detailed morphology catalogues to support research into galaxy evolution.
05/17/2019 ∙ by Mike Walmsley, et al.
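The BALD strategy mentioned above scores each unlabelled galaxy by the mutual information between its predicted label and the network weights, estimated from repeated MC-dropout forward passes. A minimal sketch of that acquisition score (illustrative only; `bald_scores` and the toy pool are assumed names, not the paper's code):

```python
import numpy as np

def bald_scores(probs):
    """BALD acquisition: mutual information between predictions and weights.

    probs: (T, N, K) array of softmax outputs -- T MC-dropout forward
    passes over N pool galaxies with K answer classes.
    """
    mean = probs.mean(axis=0)                                        # (N, K)
    entropy_of_mean = -np.sum(mean * np.log(mean + 1e-12), axis=1)   # H[E[p]]
    mean_entropy = -np.sum(probs * np.log(probs + 1e-12), axis=2).mean(axis=0)  # E[H[p]]
    return entropy_of_mean - mean_entropy                            # MI, shape (N,)

# toy pool: 5 stochastic passes, 3 galaxies, 2 classes (bar / no bar)
rng = np.random.default_rng(0)
pool_probs = rng.dirichlet([1.0, 1.0], size=(5, 3))                  # (5, 3, 2)
most_informative_first = np.argsort(bald_scores(pool_probs))[::-1]
```

The score is near zero when every pass agrees confidently, and large when passes are individually confident but disagree, which is exactly the "would a new label teach the model something" signal active learning wants.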

Learning Sparse Networks Using Targeted Dropout
Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away connections or hidden units. But standard training does not necessarily encourage nets to be amenable to pruning. We introduce targeted dropout, a method for training a neural network so that it is robust to subsequent pruning. Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights. The resulting network is robust to post hoc pruning of weights or units that frequently occur in the dropped sets. The method improves upon more complicated sparsifying regularisers while being simple to implement and easy to tune.
05/31/2019 ∙ by Aidan N. Gomez, et al.
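The select-then-drop step described above can be sketched for a single weight matrix. This is one illustrative reading of the abstract, not the authors' implementation: `targeted_dropout`, `gamma`, and `alpha` are assumed names, and the candidate set here is the lowest-magnitude fraction of weights, one plausible self-reinforcing sparsity criterion.

```python
import numpy as np

def targeted_dropout(weights, gamma=0.5, alpha=0.66, rng=None):
    """One stochastic mask in the spirit of targeted dropout (weight-level).

    gamma: fraction of lowest-magnitude weights targeted as drop candidates.
    alpha: probability of dropping each targeted weight this update.
    Returns the weights with dropped entries zeroed.
    """
    rng = rng or np.random.default_rng()
    flat = np.abs(weights).ravel()
    k = int(gamma * flat.size)
    threshold = np.sort(flat)[k - 1] if k > 0 else -np.inf
    targeted = np.abs(weights) <= threshold                # candidate set
    drop = targeted & (rng.random(weights.shape) < alpha)  # stochastic drop
    return np.where(drop, 0.0, weights)

w = np.array([[0.01, 1.2], [-0.03, -0.9]])
masked = targeted_dropout(w, gamma=0.5, alpha=1.0)
# with alpha=1.0 the two smallest-magnitude weights are always zeroed
```

Because only low-magnitude weights are ever dropped, gradients keep reinforcing the surviving large weights, so post hoc pruning of the frequently-dropped set changes the function very little.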

Evaluating Bayesian Deep Learning Methods for Semantic Segmentation
Deep learning has been revolutionary for computer vision and semantic segmentation in particular, with Bayesian Deep Learning (BDL) used to obtain uncertainty maps from deep models when predicting semantic classes. This information is critical when using semantic segmentation for autonomous driving for example. Standard semantic segmentation systems have well-established evaluation metrics. However, with BDL's rising popularity in computer vision we require new metrics to evaluate whether a BDL method produces better uncertainty estimates than another method. In this work we propose three such metrics to evaluate BDL models designed specifically for the task of semantic segmentation. We modify DeepLabv3+, one of the state-of-the-art deep neural networks, and create its Bayesian counterpart using MC dropout and Concrete dropout as inference techniques. We then compare and test these two inference techniques on the well-known Cityscapes dataset using our suggested metrics. Our results provide new benchmarks for researchers to compare and evaluate their improved uncertainty quantification in pursuit of safer semantic segmentation.
11/30/2018 ∙ by Jishnu Mukhoti, et al.
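A common per-pixel uncertainty map obtainable from an MC-dropout segmentation model (the kind of output the proposed metrics would evaluate) is the predictive entropy of the averaged softmax. A hedged sketch with assumed shapes and names:

```python
import numpy as np

def pixelwise_entropy(probs):
    """Per-pixel predictive entropy from MC-dropout samples.

    probs: (T, H, W, K) softmax maps from T stochastic forward passes.
    Returns an (H, W) uncertainty map: high where passes disagree or
    where the mean prediction is diffuse across classes.
    """
    mean = probs.mean(axis=0)                        # (H, W, K)
    return -np.sum(mean * np.log(mean + 1e-12), axis=-1)

# toy 2x2 "image": 4 passes, 3 classes
rng = np.random.default_rng(1)
sample_probs = rng.dirichlet(np.ones(3), size=(4, 2, 2))  # (4, 2, 2, 3)
uncertainty_map = pixelwise_entropy(sample_probs)         # (2, 2)
```

In practice such maps are highest along object boundaries and on rare classes, which is why metrics that reward "uncertain where wrong, confident where right" behaviour are needed to compare methods fairly.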

On the Importance of Strong Baselines in Bayesian Deep Learning
Like all subfields of machine learning, Bayesian Deep Learning is driven by empirical validation of its theoretical proposals. Given the many aspects of an experiment, it is always possible that minor or even major experimental flaws can slip by both authors and reviewers. One of the most popular experiments used to evaluate approximate inference techniques is the regression experiment on UCI datasets. However, in this experiment, models which have been trained to convergence have often been compared with baselines trained only for a fixed number of iterations. What we find is that if we take a well-established baseline and evaluate it under the same experimental settings, it shows significant improvements in performance. In fact, it outperforms or performs competitively with numerous methods that, when they were introduced, claimed to be superior to the very same baseline method. Hence, by exposing this flaw in experimental procedure, we highlight the importance of using identical experimental setups to evaluate, compare and benchmark methods in Bayesian Deep Learning.
11/23/2018 ∙ by Jishnu Mukhoti, et al.

A Unifying Bayesian View of Continual Learning
Some machine learning applications require continual learning, where data arrive in a sequence of datasets, each of which is used for training and then permanently discarded. From a Bayesian perspective, continual learning seems straightforward: given the model posterior, one would simply use this as the prior for the next task. However, exact posterior evaluation is intractable with many models, especially with Bayesian neural networks (BNNs). Instead, posterior approximations are often sought. Unfortunately, when posterior approximations are used, prior-focused approaches do not succeed in evaluations designed to capture properties of realistic continual learning use cases. As an alternative to prior-focused methods, we introduce a new approximate Bayesian derivation of the continual learning loss. Our loss does not rely on the posterior from earlier tasks, and instead adapts the model itself by changing the likelihood term. We call these approaches likelihood-focused. We then combine prior- and likelihood-focused methods into one objective, tying the two views together under a single unifying framework of approximate Bayesian continual learning.
02/18/2019 ∙ by Sebastian Farquhar, et al.
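The prior-focused view the abstract contrasts against follows from the standard Bayesian update; writing it out makes the split explicit (notation here is generic, not the paper's):

```latex
\underbrace{p(\theta \mid D_{1:t})}_{\text{new posterior}}
\;\propto\;
\underbrace{p(D_t \mid \theta)}_{\text{current likelihood}}\,
\underbrace{p(\theta \mid D_{1:t-1})}_{\text{old posterior used as prior}}
```

Prior-focused methods approximate the old-posterior factor with some $q_{t-1}(\theta)$ and carry it forward as the next prior; likelihood-focused methods instead keep approximations $\tilde{p}(D_i \mid \theta)$ of the discarded likelihood terms for $i < t$ and adapt the likelihood side of the objective, which is the unification the abstract describes.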

Evaluating Uncertainty Quantification in End-to-End Autonomous Driving Control
A rise in popularity of Deep Neural Networks (DNNs), attributed to more powerful GPUs and widely available datasets, has seen them being increasingly used within safety-critical domains. One such domain, self-driving, has benefited from significant performance improvements, with millions of miles having been driven with no human intervention. Despite this, crashes and erroneous behaviours still occur, in part due to the complexity of verifying the correctness of DNNs and a lack of safety guarantees. In this paper, we demonstrate how quantitative measures of uncertainty can be extracted in real-time, and their quality evaluated in end-to-end controllers for self-driving cars. To this end we utilise a recent method for gathering approximate uncertainty information from DNNs without changing the network's architecture. We propose evaluation techniques for the uncertainty on two separate architectures which use the uncertainty to predict crashes up to five seconds in advance. We find that mutual information, a measure of uncertainty in classification networks, is a promising indicator of forthcoming crashes.
11/16/2018 ∙ by Rhiannon Michelmore, et al.
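Mutual information here is the same disagreement-across-stochastic-passes quantity used in Bayesian active learning, and a warning can be raised when it crosses a threshold. A sketch under assumed names (`MI_THRESHOLD` is hypothetical and would need tuning on held-out near-crash sequences; the paper's actual thresholds and architectures are not reproduced here):

```python
import numpy as np

def mutual_information(probs):
    """MI between the prediction and the model weights for one frame.

    probs: (T, K) softmax outputs from T stochastic forward passes.
    High MI means the passes are individually confident but disagree,
    i.e. the model is uncertain about what it knows.
    """
    mean = probs.mean(axis=0)
    h_of_mean = -np.sum(mean * np.log(mean + 1e-12))
    mean_of_h = -np.sum(probs * np.log(probs + 1e-12), axis=1).mean()
    return h_of_mean - mean_of_h

MI_THRESHOLD = 0.3  # hypothetical value, tuned offline

def should_warn(probs):
    """Request human intervention when model disagreement spikes."""
    return mutual_information(probs) > MI_THRESHOLD
```

The appeal of MI over plain predictive entropy is that it separates model disagreement (epistemic uncertainty) from inherently ambiguous inputs, which matters when deciding whether a human should take over.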

Radial Bayesian Neural Networks: Robust Variational Inference In Big Models
We propose Radial Bayesian Neural Networks: a variational distribution for mean-field variational inference (MFVI) in Bayesian neural networks that is simple to implement, scalable to large models, and robust to hyperparameter selection. We hypothesize that standard MFVI fails in large models because of a property of the high-dimensional Gaussians used as posteriors. As variances grow, samples come almost entirely from a 'soap-bubble' far from the mean. We show that the ad hoc tweaks used previously in the literature to get MFVI to work served to stop such variances growing. Designing a new posterior distribution, we avoid this pathology in a theoretically principled way. Our distribution improves accuracy and uncertainty over standard MFVI, while scaling to large data where most other VI and MCMC methods struggle. We benchmark Radial BNNs on a real-world task of diabetic retinopathy diagnosis from fundus images, a task with 100x larger input dimensionality and model size compared to previous demonstrations of MFVI.
07/01/2019 ∙ by Sebastian Farquhar, et al.
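The 'soap-bubble' pathology and the radial fix can be illustrated directly: a sample from a d-dimensional standard Gaussian has norm concentrated near √d, whereas a radial posterior factorises the noise into a uniform direction and a scalar Gaussian radius. A sketch with assumed names, following the construction the abstract implies:

```python
import numpy as np

def sample_radial(mu, sigma, rng=None):
    """Draw one weight sample from a radial variational posterior (sketch).

    Standard MFVI samples w = mu + sigma * eps with eps ~ N(0, I), so
    ||w - mu|| concentrates near sigma * sqrt(d) in d dimensions (the
    soap-bubble). Here the noise is a unit direction times a scalar
    Gaussian radius, so samples stay near the mean regardless of d.
    """
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(mu.shape)
    direction = eps / np.linalg.norm(eps)   # uniform on the unit sphere
    radius = rng.standard_normal()          # scalar radial coordinate
    return mu + sigma * direction * radius

mu, sigma = np.zeros(10_000), np.ones(10_000)
w = sample_radial(mu, sigma)
# ||w - mu|| behaves like |N(0, 1)|, not like sqrt(10_000) ≈ 100
```

Because the sample's distance from the mean no longer grows with dimension, the gradient noise that destabilises standard MFVI in large models is avoided.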

Generalizing from a few environments in safety-critical reinforcement learning
Before deploying autonomous agents in the real world, we need to be confident they will perform safely in novel situations. Ideally, we would expose agents to a very wide range of situations during training, allowing them to learn about every possible danger, but this is often impractical. This paper investigates safety and generalization from a limited number of training environments in deep reinforcement learning (RL). We find RL algorithms can fail dangerously on unseen test environments even when performing perfectly on training environments. Firstly, in a gridworld setting, we show that catastrophes can be significantly reduced with simple modifications, including ensemble model averaging and the use of a blocking classifier. In the more challenging CoinRun environment we find similar methods do not significantly reduce catastrophes. However, we do find that the uncertainty information from the ensemble is useful for predicting whether a catastrophe will occur within a few steps and hence whether human intervention should be requested.
07/02/2019 ∙ by Zachary Kenton, et al.

Differentially Private Continual Learning
Catastrophic forgetting can be a significant problem for institutions that must delete historic data for privacy reasons. For example, hospitals might not be able to retain patient data permanently. But neural networks trained on recent data alone will tend to forget lessons learned on old data. We present a differentially private continual learning framework based on variational inference. We estimate the likelihood of past data given the current model using differentially private generative models of old datasets.
02/18/2019 ∙ by Sebastian Farquhar, et al.

Towards Robust Evaluations of Continual Learning
Continual learning experiments used in current deep learning papers do not faithfully assess fundamental challenges of learning continually, masking weak points of the suggested approaches instead. We study gaps in such existing evaluations, proposing essential experimental evaluations that are more representative of continual learning's challenges, and suggest a re-prioritization of research efforts in the field. We show that current approaches fail with our new evaluations and, to analyse these failures, we propose a variational loss which unifies many existing solutions to continual learning under a Bayesian framing, as either 'prior-focused' or 'likelihood-focused'. We show that while prior-focused approaches such as EWC and VCL perform well on existing evaluations, they perform dramatically worse when compared to likelihood-focused approaches on other simple tasks.
05/24/2018 ∙ by Sebastian Farquhar, et al.

Towards Inverse Reinforcement Learning for Limit Order Book Dynamics
Multi-agent learning is a promising method to simulate aggregate competitive behaviour in finance. Learning expert agents' reward functions through their external demonstrations is hence particularly relevant for the subsequent design of realistic agent-based simulations. Inverse Reinforcement Learning (IRL) aims at acquiring such reward functions through inference, allowing the resulting policy to generalize to states not observed in the past. This paper investigates whether IRL can infer such rewards from agents within real financial stochastic environments: limit order books (LOB). We introduce a simple one-level LOB, where the interactions of a number of stochastic agents and an expert trading agent are modelled as a Markov decision process. We consider two cases for the expert's reward: either a simple linear function of state features, or a complex, more realistic non-linear function. Given the expert agent's demonstrations, we attempt to discover their strategy by modelling their latent reward function using linear and Gaussian process (GP) regressors from previous literature, and our own approach through Bayesian neural networks (BNNs). While all three methods can learn the linear case, only the GP-based and our proposed BNN methods are able to discover the non-linear reward. Our BNN IRL algorithm outperforms the other two approaches as the number of samples increases. These results illustrate that complex behaviours, induced by non-linear reward functions amid agent-based stochastic scenarios, can be deduced through inference, encouraging the use of inverse reinforcement learning for opponent-modelling in multi-agent systems.
06/11/2019 ∙ by Jacobo Roa-Vicens, et al.
Yarin Gal
Associate Professor of Machine Learning at the Computer Science department at University of Oxford