Stephen J. Roberts

is this you? claim profile


Professor of Machine Learning & Director, Oxford-Man Institute of Quantitative Finance at University of Oxford since 1999, CSO & co-founder at Mind Foundry since 2016, Project Scientist of the Machine Learning Research Group, Professorial Fellow of Somerville College, former Director of the EPSRC Centre for Doctoral Training in Autonomous, Intelligent Machines and Systems (AIMS) and Director of the Oxford-Man Institute.

  • Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation

    Batch Bayesian optimisation (BO) has been successfully applied to hyperparameter tuning using parallel computing, but it is wasteful of resources: workers that complete jobs ahead of others are left idle. We address this problem by developing an approach, Penalising Locally for Asynchronous Bayesian Optimisation on k workers (PLAyBOOK), for asynchronous parallel BO. We demonstrate empirically the efficacy of PLAyBOOK and its variants on synthetic tasks and a real-world problem. We undertake a comparison between synchronous and asynchronous BO, and show that asynchronous BO often outperforms synchronous batch BO in both wall-clock time and number of function evaluations.

    01/29/2019 ∙ by Ahsan S. Alvi, et al. ∙ 14 share

    read it

  • Bayesian Optimisation over Multiple Continuous and Categorical Inputs

    Efficient optimisation of black-box problems that comprise both continuous and categorical inputs is important, yet poses significant challenges. We propose a new approach, Continuous and Categorical Bayesian Optimisation (CoCaBO), which combines the strengths of multi-armed bandits and Bayesian optimisation to select values for both categorical and continuous inputs. We model this mixed-type space using a Gaussian Process kernel, designed to allow sharing of information across multiple categorical variables, each with multiple possible values; this allows CoCaBO to leverage all available data efficiently. We extend our method to the batch setting and propose an efficient selection procedure that dynamically balances exploration and exploitation whilst encouraging batch diversity. We demonstrate empirically that our method outperforms existing approaches on both synthetic and real-world optimisation tasks with continuous and categorical inputs.

    06/20/2019 ∙ by Binxin Ru, et al. ∙ 5 share

    read it

  • Optimization, fast and slow: optimally switching between local and Bayesian optimization

    We develop the first Bayesian Optimization algorithm, BLOSSOM, which selects between multiple alternative acquisition functions and traditional local optimization at each step. This is combined with a novel stopping condition based on expected regret. This pairing allows us to obtain the best characteristics of both local and Bayesian optimization, making efficient use of function evaluations while yielding superior convergence to the global minimum on a selection of optimization problems, and also halting optimization once a principled and intuitive stopping condition has been fulfilled.

    05/22/2018 ∙ by Mark McLeod, et al. ∙ 2 share

    read it

  • Learning from lions: inferring the utility of agents from their trajectories

    We build a model using Gaussian processes to infer a spatio-temporal vector field from observed agent trajectories. Significant landmarks or influence points in agent surroundings are jointly derived through vector calculus operations that indicate presence of sources and sinks. We evaluate these influence points by using the Kullback-Leibler divergence between the posterior and prior Laplacian of the inferred spatio-temporal vector field. Through locating significant features that influence trajectories, our model aims to give greater insight into underlying causal utility functions that determine agent decision-making. A key feature of our model is that it infers a joint Gaussian process over the observed trajectories, the time-varying vector field of utility and canonical vector calculus operators. We apply our model to both synthetic data and lion GPS data collected at the Bubye Valley Conservancy in southern Zimbabwe.

    09/07/2017 ∙ by Adam D. Cobb, et al. ∙ 0 share

    read it

  • Optimal client recommendation for market makers in illiquid financial products

    The process of liquidity provision in financial markets can result in prolonged exposure to illiquid instruments for market makers. In this case, where a proprietary position is not desired, pro-actively targeting the right client who is likely to be interested can be an effective means to offset this position, rather than relying on commensurate interest arising through natural demand. In this paper, we consider the inference of a client profile for the purpose of corporate bond recommendation, based on typical recorded information available to the market maker. Given a historical record of corporate bond transactions and bond meta-data, we use a topic-modelling analogy to develop a probabilistic technique for compiling a curated list of client recommendations for a particular bond that needs to be traded, ranked by probability of interest. We show that a model based on Latent Dirichlet Allocation offers promising performance to deliver relevant recommendations for sales traders.

    04/27/2017 ∙ by Dieter Hendricks, et al. ∙ 0 share

    read it

  • Distribution of Gaussian Process Arc Lengths

    We present the first treatment of the arc length of the Gaussian Process (GP) with more than a single output dimension. GPs are commonly used for tasks such as trajectory modelling, where path length is a crucial quantity of interest. Previously, only paths in one dimension have been considered, with no theoretical consideration of higher dimensional problems. We fill the gap in the existing literature by deriving the moments of the arc length for a stationary GP with multiple output dimensions. A new method is used to derive the mean of a one-dimensional GP over a finite interval, by considering the distribution of the arc length integrand. This technique is used to derive an approximate distribution over the arc length of a vector valued GP in R^n by moment matching the distribution. Numerical simulations confirm our theoretical derivations.

    03/23/2017 ∙ by Justin D. Bewsher, et al. ∙ 0 share

    read it

  • Practical Bayesian Optimization for Variable Cost Objectives

    We propose a novel Bayesian Optimization approach for black-box functions with an environmental variable whose value determines the tradeoff between evaluation cost and the fidelity of the evaluations. Further, we use a novel approach to sampling support points, allowing faster construction of the acquisition function. This allows us to achieve optimization with lower overheads than previous approaches and is implemented for a more general class of problem. We show this approach to be effective on synthetic and real world benchmark problems.

    03/13/2017 ∙ by Mark McLeod, et al. ∙ 0 share

    read it

  • p-Markov Gaussian Processes for Scalable and Expressive Online Bayesian Nonparametric Time Series Forecasting

    In this paper we introduce a novel online time series forecasting model we refer to as the pM-GP filter. We show that our model is equivalent to Gaussian process regression, with the advantage that both online forecasting and online learning of the hyper-parameters have a constant (rather than cubic) time complexity and a constant (rather than squared) memory requirement in the number of observations, without resorting to approximations. Moreover, the proposed model is expressive in that the family of covariance functions of the implied latent process, namely the spectral Matern kernels, have recently been proven to be capable of approximating arbitrarily well any translation-invariant covariance function. The benefit of our approach compared to competing models is demonstrated using experiments on several real-life datasets.

    10/09/2015 ∙ by Yves-Laurent Kom Samo, et al. ∙ 0 share

    read it

  • A Variational Bayesian State-Space Approach to Online Passive-Aggressive Regression

    Online Passive-Aggressive (PA) learning is a class of online margin-based algorithms suitable for a wide range of real-time prediction tasks, including classification and regression. PA algorithms are formulated in terms of deterministic point-estimation problems governed by a set of user-defined hyperparameters: the approach fails to capture model/prediction uncertainty and makes their performance highly sensitive to hyperparameter configurations. In this paper, we introduce a novel PA learning framework for regression that overcomes the above limitations. We contribute a Bayesian state-space interpretation of PA regression, along with a novel online variational inference scheme, that not only produces probabilistic predictions, but also offers the benefit of automatic hyperparameter tuning. Experiments with various real-world data sets show that our approach performs significantly better than a more standard, linear Gaussian state-space model.

    09/08/2015 ∙ by Arnold Salas, et al. ∙ 0 share

    read it

  • Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature

    We propose a novel sampling framework for inference in probabilistic models: an active learning approach that converges more quickly (in wall-clock time) than Markov chain Monte Carlo (MCMC) benchmarks. The central challenge in probabilistic inference is numerical integration, to average over ensembles of models or unknown (hyper-)parameters (for example to compute the marginal likelihood or a partition function). MCMC has provided approaches to numerical integration that deliver state-of-the-art inference, but can suffer from sample inefficiency and poor convergence diagnostics. Bayesian quadrature techniques offer a model-based solution to such problems, but their uptake has been hindered by prohibitive computation costs. We introduce a warped model for probabilistic integrands (likelihoods) that are known to be non-negative, permitting a cheap active learning scheme to optimally select sample locations. Our algorithm is demonstrated to offer faster convergence (in seconds) relative to simple Monte Carlo and annealed importance sampling on both synthetic and real-world examples.

    11/03/2014 ∙ by Tom Gunter, et al. ∙ 0 share

    read it

  • Variational Inference for Gaussian Process Modulated Poisson Processes

    We present the first fully variational Bayesian inference scheme for continuous Gaussian-process-modulated Poisson processes. Such point processes are used in a variety of domains, including neuroscience, geo-statistics and astronomy, but their use is hindered by the computational cost of existing inference schemes. Our scheme: requires no discretisation of the domain; scales linearly in the number of observed events; and is many orders of magnitude faster than previous sampling based approaches. The resulting algorithm is shown to outperform standard methods on synthetic examples, coal mining disaster data and in the prediction of Malaria incidences in Kenya.

    11/02/2014 ∙ by Chris Lloyd, et al. ∙ 0 share

    read it