Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation

06/16/2022
by   Sebastian Ament, et al.
6

Bayesian Optimization (BO) has shown great promise for the global optimization of functions that are expensive to evaluate, but despite many successes, standard approaches can struggle in high dimensions. To improve the performance of BO, prior work suggested incorporating gradient information into a Gaussian process surrogate of the objective, giving rise to kernel matrices of size nd × nd for n observations in d dimensions. Naïvely multiplying with (resp. inverting) these matrices requires 𝒪(n^2d^2) (resp. 𝒪(n^3d^3)) operations, which becomes infeasible for moderate dimensions and sample sizes. Here, we observe that a wide range of kernels gives rise to structured matrices, enabling an exact 𝒪(n^2d) matrix-vector multiply for gradient observations and 𝒪(n^2d^2) for Hessian observations. Beyond canonical kernel classes, we derive a programmatic approach to leveraging this type of structure for transformations and combinations of the discussed kernel classes, which constitutes a structure-aware automatic differentiation algorithm. Our methods apply to virtually all canonical kernels and automatically extend to complex kernels, like the neural network, radial basis function network, and spectral mixture kernels without any additional derivations, enabling flexible, problem-dependent modeling while scaling first-order BO to high d.

READ FULL TEXT

page 6

page 7

page 8

research
04/17/2023

Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization

The linearized-Laplace approximation (LLA) has been shown to be effectiv...
research
06/25/2019

Spectral Properties of Radial Kernels and Clustering in High Dimensions

In this paper, we study the spectrum and the eigenvectors of radial kern...
research
11/02/2021

Geometry-aware Bayesian Optimization in Robotics using Riemannian Matérn Kernels

Bayesian optimization is a data-efficient technique which can be used fo...
research
02/15/2021

High-Dimensional Gaussian Process Inference with Derivatives

Although it is widely known that Gaussian processes can be conditioned o...
research
06/14/2021

Marginalising over Stationary Kernels with Bayesian Quadrature

Marginalising over families of Gaussian Process kernels produces flexibl...
research
06/12/2018

Differentiable Compositional Kernel Learning for Gaussian Processes

The generalization properties of Gaussian processes depend heavily on th...
research
02/25/2021

Mixed Variable Bayesian Optimization with Frequency Modulated Kernels

The sample efficiency of Bayesian optimization(BO) is often boosted by G...

Please sign up or login with your details

Forgot password? Click here to reset