
Meta-Learning surrogate models for sequential decision making
Meta-learning methods leverage past experience to learn data-driven inductive biases from related problems, increasing learning efficiency on new tasks. This ability renders them particularly suitable for sequential decision making with limited experience. Within this problem family, we argue for the use of such approaches in the study of model-based approaches to Bayesian Optimisation, contextual bandits and Reinforcement Learning. We approach the problem by learning distributions over functions using Neural Processes (NPs), a recently introduced probabilistic meta-learning method. This allows the treatment of model uncertainty to tackle the exploration/exploitation dilemma. We show that NPs are suitable for sequential decision making on a diverse set of domains, including adversarial task search, recommender systems and model-based reinforcement learning.
03/28/2019 ∙ by Alexandre Galashov, et al.
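The exploration/exploitation treatment above relies on acting under a learned posterior over functions. A minimal sketch of one such decision rule, Thompson sampling, with a generic Gaussian predictive standing in for an NP posterior over a hypothetical discrete action set (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a Neural Process posterior: per-candidate predictive
# means and standard deviations over an unknown reward function.
candidates = np.linspace(0.0, 1.0, 5)        # hypothetical action set
mean = np.array([0.1, 0.4, 0.8, 0.5, 0.2])   # predictive means
std = np.array([0.3, 0.2, 0.1, 0.2, 0.3])    # predictive uncertainty

def thompson_choice(mean, std, rng):
    """Sample one plausible function from the posterior and act greedily on it."""
    sample = rng.normal(mean, std)
    return int(np.argmax(sample))

choice = thompson_choice(mean, std, rng)
```

High-uncertainty actions get chosen occasionally even when their mean is low, which is exactly how model uncertainty drives exploration here.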

Exploiting Hierarchy for Learning and Transfer in KL-regularized RL
As reinforcement learning agents are tasked with solving more challenging and diverse tasks, the ability to incorporate prior knowledge into the learning system and to exploit reusable structure in solution space is likely to become increasingly important. The KL-regularized expected reward objective constitutes one possible tool to this end. It introduces an additional component, a default or prior behavior, which can be learned alongside the policy and as such partially transforms the reinforcement learning problem into one of behavior modelling. In this work we consider the implications of this framework in cases where both the policy and default behavior are augmented with latent variables. We discuss how the resulting hierarchical structures can be used to implement different inductive biases and how their modularity can benefit transfer. Empirically we find that they can lead to faster learning and transfer on a range of continuous control tasks.
03/18/2019 ∙ by Dhruva Tirumala, et al.
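The KL-regularized expected reward objective above has a simple one-step form, J(π) = E_π[r] − α KL(π ‖ π₀), trading reward against divergence from the default behavior. A minimal numeric sketch for a discrete action space (the distributions and α are illustrative, not from the paper):

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

def kl_regularized_objective(policy, default, rewards, alpha=0.1):
    """Expected reward under `policy`, penalised by KL to the default behavior."""
    expected_reward = float(np.dot(policy, rewards))
    return expected_reward - alpha * kl(policy, default)

policy = np.array([0.7, 0.2, 0.1])     # learned action distribution
default = np.array([1/3, 1/3, 1/3])    # default / prior behavior
rewards = np.array([1.0, 0.5, 0.0])
score = kl_regularized_objective(policy, default, rewards)
```

A policy identical to the default pays no penalty; the further it departs from the default, the more reward it must earn to be preferred.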

On Exploration, Exploitation and Learning in Adaptive Importance Sampling
We study adaptive importance sampling (AIS) as an online learning problem and argue for the importance of the trade-off between exploration and exploitation in this adaptation. Borrowing ideas from the bandits literature, we propose Daisee, a partition-based AIS algorithm. We further introduce a notion of regret for AIS and show that Daisee has O(√T (log T)^(3/4)) cumulative pseudo-regret, where T is the number of iterations. We then extend Daisee to adaptively learn a hierarchical partitioning of the sample space for more efficient sampling and confirm the performance of both algorithms empirically.
10/31/2018 ∙ by Xiaoyu Lu, et al.
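The stated guarantee, O(√T (log T)^(3/4)) cumulative pseudo-regret, is sublinear in T, so the average per-round regret vanishes as the sampler adapts. A small sketch of that growth shape (the constant c is illustrative):

```python
import numpy as np

def pseudo_regret_bound(T, c=1.0):
    """Shape of the stated guarantee: O(sqrt(T) * (log T)^(3/4))."""
    return c * np.sqrt(T) * np.log(T) ** 0.75

# Sublinearity: the bound divided by T shrinks as T grows.
per_round = [pseudo_regret_bound(T) / T for T in (10, 100, 10_000)]
```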

Augmented Neural ODEs
We show that Neural Ordinary Differential Equations (ODEs) learn representations that preserve the topology of the input space and prove that this implies the existence of functions Neural ODEs cannot represent. To address these limitations, we introduce Augmented Neural ODEs which, in addition to being more expressive models, are empirically more stable, generalize better and have a lower computational cost than Neural ODEs.
04/02/2019 ∙ by Emilien Dupont, et al.
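The augmentation idea can be illustrated without a learned model: lift the input with extra zero coordinates and integrate the ODE in the larger space, where trajectories that would have to cross in the original coordinates are free to pass each other. A toy sketch with hand-picked linear dynamics and a fixed-step Euler solver (both illustrative):

```python
import numpy as np

def euler_ode(f, x0, steps=100, t1=1.0):
    """Integrate dx/dt = f(x) with fixed-step Euler."""
    x, dt = np.array(x0, float), t1 / steps
    for _ in range(steps):
        x = x + dt * f(x)
    return x

def augment(x, p=1):
    """ANODE idea: append p zero coordinates so the ODE is solved
    in a higher-dimensional space than the input."""
    return np.concatenate([np.asarray(x, float), np.zeros(p)])

x = np.array([1.0, -2.0])
x_aug = augment(x, p=2)          # state lives in R^4 instead of R^2
out = euler_ode(lambda z: -z, x_aug)
```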

Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.
05/08/2019 ∙ by Pedro A. Ortega, et al.

Variational Estimators for Bayesian Optimal Experimental Design
Bayesian optimal experimental design (BOED) is a principled framework for making efficient use of limited experimental resources. Unfortunately, its applicability is hampered by the difficulty of obtaining accurate estimates of the expected information gain (EIG) of an experiment. To address this, we introduce several classes of fast EIG estimators suited to the experiment design context by building on ideas from variational inference and mutual information estimation. We show theoretically and empirically that these estimators can provide significant gains in speed and accuracy over previous approaches. We demonstrate the practicality of our approach via a number of experiments, including an adaptive experiment with human participants.
03/13/2019 ∙ by Adam Foster, et al.
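The difficulty the abstract refers to is visible in the standard baseline, the nested Monte Carlo EIG estimator, which the variational estimators aim to beat in speed and accuracy. A minimal sketch of that baseline on a made-up binary experiment (the prior, likelihood and design labels are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def nmc_eig(sample_theta, lik, design, N=1000, M=100):
    """Nested Monte Carlo EIG: average of
    log p(y|theta,d) - log[(1/M) sum_m p(y|theta_m,d)]."""
    total = 0.0
    for _ in range(N):
        theta = sample_theta()
        y = rng.random() < lik(theta, design)            # Bernoulli outcome
        p_y_theta = lik(theta, design) if y else 1 - lik(theta, design)
        inner = [sample_theta() for _ in range(M)]
        marg = np.mean([lik(t, design) if y else 1 - lik(t, design) for t in inner])
        total += np.log(p_y_theta) - np.log(marg)
    return total / N

# Hypothetical experiment: theta ~ Uniform{0,1}; the outcome depends on
# theta only under the "informative" design.
sample_theta = lambda: rng.integers(0, 2)
lik = lambda theta, d: (0.9 if theta == 1 else 0.1) if d == "informative" else 0.5
eig_good = nmc_eig(sample_theta, lik, "informative")
eig_bad = nmc_eig(sample_theta, lik, "uninformative")
```

The informative design scores close to the true mutual information (about 0.37 nats here), while the uninformative one scores zero; the cost is N×M likelihood evaluations, which is what faster estimators try to avoid.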

Functional Regularisation for Continual Learning using Gaussian Processes
We introduce a novel approach for supervised continual learning based on approximate Bayesian inference over function space rather than the parameters of a deep neural network. We use a Gaussian process obtained by treating the weights of the last layer of a neural network as random and Gaussian distributed. Functional regularisation for continual learning naturally arises by applying the variational sparse GP inference method in a sequential fashion as new tasks are encountered. At each step of the process, a summary is constructed for the current task that consists of (i) inducing inputs and (ii) a posterior distribution over the function values at these inputs. This summary then regularises learning of future tasks, through Kullback-Leibler regularisation terms that appear in the variational lower bound, and reduces the effects of catastrophic forgetting. We fully develop the theory of the method and we demonstrate its effectiveness in classification datasets, such as Split-MNIST, Permuted-MNIST and Omniglot.
01/31/2019 ∙ by Michalis K. Titsias, et al.
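For a diagonal Gaussian approximation, the Kullback-Leibler regularisation term mentioned above has a closed form: the stored task summary acts as the prior, and the current variational posterior over the same inducing points is penalised for drifting from it. A minimal sketch (the inducing-point posteriors are illustrative numbers, not learned):

```python
import numpy as np

def gauss_kl(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians,
    summed over inducing points."""
    mu_q, var_q = np.asarray(mu_q, float), np.asarray(var_q, float)
    mu_p, var_p = np.asarray(mu_p, float), np.asarray(var_p, float)
    return 0.5 * np.sum(np.log(var_p / var_q)
                        + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

# Hypothetical summary of a previous task at 3 inducing inputs ...
prev_mu, prev_var = np.array([0.0, 1.0, -0.5]), np.array([0.1, 0.2, 0.1])
# ... and the current task's posterior over the same function values.
cur_mu, cur_var = np.array([0.1, 0.9, -0.4]), np.array([0.1, 0.2, 0.1])
penalty = gauss_kl(cur_mu, cur_var, prev_mu, prev_var)
```

The penalty is zero exactly when the new posterior agrees with the stored summary, which is how forgetting is discouraged.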

Neural probabilistic motor primitives for humanoid control
We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this model entirely offline to compress thousands of expert policies and learn a motor primitive embedding space. The trained neural probabilistic motor primitive system can perform one-shot imitation of whole-body humanoid behaviors, robustly mimicking unseen trajectories. Additionally, we demonstrate that it is also straightforward to train controllers to reuse the learned motor primitive space to solve tasks, and the resulting movements are relatively naturalistic. To support the training of our model, we compare two approaches for offline policy cloning, including an experience-efficient method which we call linear feedback policy cloning. We encourage readers to view a supplementary video summarizing our results ( https://youtu.be/1NAHsrrH2t0 ).
11/28/2018 ∙ by Josh Merel, et al.

Hybrid Models with Deep and Invertible Features
We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e. a normalizing flow). An attractive property of our model is that both p(features), the features' density, and p(targets | features), the predictive distribution, can be computed exactly in a single feedforward pass. We show that our hybrid model, despite the invertibility constraints, achieves similar accuracy to purely predictive models. Yet the generative component remains a good model of the input features despite the hybrid optimization objective. This offers additional capabilities such as detection of out-of-distribution inputs and enabling semi-supervised learning. The availability of the exact joint density p(targets, features) also allows us to compute many quantities readily, making our hybrid model a useful building block for downstream applications of probabilistic deep learning.
02/07/2019 ∙ by Eric Nalisnick, et al.
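The exact joint factorises as log p(targets, features) = log p(targets | features) + log p(features), where the second term comes from the flow's change-of-variables formula. A one-dimensional sketch with a hand-picked affine bijection and a logistic predictive head (all parameters illustrative, not a trained model):

```python
import numpy as np

def log_standard_normal(z):
    return -0.5 * (z ** 2 + np.log(2 * np.pi))

def hybrid_log_joint(x, y, a=2.0, b=0.5, w=1.5):
    """log p(y, x) = log p(y | z) + log p(x), with an invertible
    affine 'flow' z = a*x + b and a logistic head on z."""
    z = a * x + b
    # change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|
    log_px = log_standard_normal(z) + np.log(abs(a))
    p_y1 = 1.0 / (1.0 + np.exp(-w * z))          # p(y=1 | features)
    log_py = np.log(p_y1) if y == 1 else np.log(1.0 - p_y1)
    return log_py + log_px

lj = hybrid_log_joint(x=0.3, y=1)
```

Both terms come out of the same forward pass through z, which is the property the abstract highlights; summing the joint over y recovers p(features) exactly.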

Set Transformer
Many machine learning tasks such as multiple instance learning, 3D shape recognition and few-shot image classification are defined on sets of instances. Since solutions to such problems do not depend on the permutation of elements of the set, models used to address them should be permutation invariant. We present an attention-based neural network module, the Set Transformer, specifically designed to model interactions among elements in the input set. The model consists of an encoder and a decoder, both of which rely on attention mechanisms. In an effort to reduce computational complexity, we introduce an attention scheme inspired by inducing point methods from sparse Gaussian process literature. It reduces computation time of self-attention from quadratic to linear in the number of elements in the set. We show that our model is theoretically attractive and we evaluate it on a range of tasks, demonstrating increased performance compared to recent methods for set-structured data.
10/01/2018 ∙ by Juho Lee, et al.
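The inducing-point attention scheme can be sketched with plain arrays: attend from m inducing points to the n set elements to form a summary, then attend from each element back to that summary, so the cost scales as O(nm) instead of O(n²). A minimal single-head numpy sketch (the dimensions and the random stand-in for learned inducing points are illustrative):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention of rows of q over rows of k."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def induced_attention(x, inducing):
    """Inducing-point trick: route set attention through m points,
    reducing O(n^2) to O(n*m) in the set size n."""
    h = attention(inducing, x, x)   # (m, d) summary of the whole set
    return attention(x, h, h)       # (n, d): elements attend to the summary

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))        # a set of n = 100 elements
inducing = rng.normal(size=(8, 4))   # m = 8 stand-in inducing points
out = induced_attention(x, inducing)
```

Because the summary h is computed by pooling over the whole set, permuting the input elements simply permutes the output rows, matching the permutation symmetry the abstract requires.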

Probabilistic symmetry and invariant neural networks
In an effort to improve the performance of deep neural networks in data-scarce, non-i.i.d., or unsupervised settings, much recent research has been devoted to encoding invariance under symmetry transformations into neural network architectures. We treat the neural network input and output as random variables, and consider group invariance from the perspective of probabilistic symmetry. Drawing on tools from probability and statistics, we establish a link between functional and probabilistic symmetry, and obtain generative functional representations of joint and conditional probability distributions that are invariant or equivariant under the action of a compact group. Those representations completely characterize the structure of neural networks that can be used to model such distributions and yield a general program for constructing invariant stochastic or deterministic neural networks. We develop the details of the general program for exchangeable sequences and arrays, recovering a number of recent examples as special cases.
01/18/2019 ∙ by Benjamin Bloem-Reddy, et al.
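One standard construction in this family is symmetrization: averaging an arbitrary function over the group orbit yields an invariant one. A minimal sketch for the permutation group on a short sequence, i.e. the exchangeable case discussed above (the function f is an arbitrary illustrative choice, and full orbit averaging is only tractable for small n):

```python
import numpy as np
from itertools import permutations

def symmetrize(f, x):
    """Make a function of a sequence permutation-invariant by averaging
    its output over every reordering of the input."""
    x = np.asarray(x, float)
    orbits = [x[list(p)] for p in permutations(range(len(x)))]
    return float(np.mean([f(xi) for xi in orbits]))

# An arbitrary, non-invariant function of a sequence ...
f = lambda s: float(s[0] * 2.0 + s[-1])
x = np.array([1.0, 2.0, 3.0])
# ... becomes invariant after group averaging:
v1 = symmetrize(f, x)
v2 = symmetrize(f, x[::-1])
```

Practical architectures replace the explicit orbit average with structures (e.g. pooling over elements) that are invariant by construction, which is what the paper's characterization results license.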
Yee Whye Teh
Professorial Research Fellow (RSIV) of Statistical Machine Learning at University of Oxford