Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning

07/01/2021
by Jonathan Wenger, et al.

Gaussian processes remain popular as a flexible and expressive model class, but the computational cost of kernel hyperparameter optimization remains a major limiting factor to their scaling and broader adoption. Recent work has made great strides combining stochastic estimation with iterative numerical techniques, essentially boiling down GP inference to the cost of (many) matrix-vector multiplies. Preconditioning, a highly effective step for any iterative method involving matrix-vector multiplication, can be used to accelerate convergence and thus reduce bias in hyperparameter optimization. Here, we prove that preconditioning has an additional, previously unexplored benefit. It not only reduces the bias of the log-marginal likelihood estimator and its derivatives, but it can also simultaneously reduce variance at essentially negligible cost. We leverage this result to derive sample-efficient algorithms for GP hyperparameter optimization requiring as few as 𝒪(log(ε^-1)) instead of 𝒪(ε^-2) samples to achieve error ε. Our theoretical results enable provably efficient and scalable optimization of kernel hyperparameters, which we validate empirically on a set of large-scale benchmark problems. On these, variance reduction via preconditioning results in an order-of-magnitude speedup in hyperparameter optimization of exact GPs.
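
To make the variance-reduction mechanism concrete, the sketch below (an assumed setup, not the authors' implementation) compares Hutchinson-type stochastic estimates of log det(K̂) with and without a preconditioner. A Nyström-style low-rank-plus-noise preconditioner stands in for the partial Cholesky preconditioners typically used in practice, and dense matrix logarithms are used only so the toy comparison stays short; at scale these traces are estimated with stochastic Lanczos quadrature and matrix-vector multiplies.

```python
# Illustrative sketch (assumed setup): variance of Hutchinson estimates of
# log det(K_hat) with and without a preconditioner P.
import numpy as np
from scipy.linalg import cholesky, logm

rng = np.random.default_rng(0)

# Toy kernel matrix K_hat = K + sigma^2 * I on random 1-D inputs.
n, sigma2 = 200, 0.1
x = np.sort(rng.uniform(0.0, 5.0, size=n))
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)   # RBF kernel
K_hat = K + sigma2 * np.eye(n)

# Nystrom-style low-rank preconditioner from r spread-out pivot points,
# standing in for the partial Cholesky preconditioners used in practice.
r = 20
idx = np.linspace(0, n - 1, r, dtype=int)
L = cholesky(K[np.ix_(idx, idx)] + 1e-8 * np.eye(r), lower=True)
U = np.linalg.solve(L, K[idx, :]).T                  # n x r factor
P = U @ U.T + sigma2 * np.eye(n)

def hutchinson(M, num_samples):
    """Hutchinson estimates z^T M z of tr(M) using Rademacher probes z."""
    z = rng.choice([-1.0, 1.0], size=(M.shape[0], num_samples))
    return np.einsum("is,is->s", z, M @ z)

# log det(K_hat) = tr(log K_hat)
#               = log det(P) + tr(log(P^{-1} K_hat))   (preconditioned split).
# Dense matrix logarithms are used here purely for the toy comparison.
log_K = np.real(logm(K_hat))
log_PK = np.real(logm(np.linalg.solve(P, K_hat)))
logdet_P = np.linalg.slogdet(P)[1]

plain = hutchinson(log_K, 1000)
precond = logdet_P + hutchinson(log_PK, 1000)

print("true log det    :", np.linalg.slogdet(K_hat)[1])
print("plain estimator : mean %.2f, std %.2f" % (plain.mean(), plain.std()))
print("preconditioned  : mean %.2f, std %.2f" % (precond.mean(), precond.std()))
```

If the preconditioner captures K̂ well, P^{-1}K̂ is close to the identity, so most of the log-determinant is absorbed into the deterministic log det(P) term and the remaining stochastic trace has much smaller variance. This is the mechanism the abstract refers to, applied analogously to the derivatives of the log-marginal likelihood.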


Related research

- Scalable Gaussian Process Hyperparameter Optimization via Coverage Regularization (09/22/2022)
- GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration (09/28/2018)
- Noise Estimation in Gaussian Process Regression (06/20/2022)
- Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits (11/19/2021)
- Sparse Kernel Gaussian Processes through Iterative Charted Refinement (ICR) (06/21/2022)
- When are Iterative Gaussian Processes Reliably Accurate? (12/31/2021)
- Bias-Free Scalable Gaussian Processes via Randomized Truncations (02/12/2021)
