Fast Gaussian Process Regression for Big Data

09/17/2015
by   Sourish Das, et al.

Gaussian Processes are widely used for regression. A known limitation is that computing the solution requires inverting a matrix and storing a large matrix in memory, both of which grow with the number of training points. These factors restrict Gaussian Process regression to small and moderately sized data sets. We present an algorithm that combines estimates from models fitted to subsets of the data, obtained in a manner similar to the bootstrap. The sample size is a critical parameter for this algorithm. We provide guidelines for reasonable choices of the algorithm parameters, based on a detailed experimental study.

Various techniques have been proposed to scale Gaussian Processes to large-scale regression tasks, and the most appropriate choice depends on the problem context. The proposed method is most appropriate for problems where an additive model works well and the response depends on a small number of features. The minimax rate of convergence for such problems is attractive, so effective models can be built with a small subset of the data. The Stochastic Variational Gaussian Process and the Sparse Gaussian Process are also appropriate choices for such problems. These methods pick a subset of the data based on theoretical considerations, whereas the proposed algorithm uses bagging and random sampling. Results from experiments conducted as part of this study indicate that the proposed algorithm can be as effective as these methods. Model stacking can be used to combine a model developed with the proposed method with models from other large-scale regression methods, such as Gradient Boosted Trees, which can yield performance gains.
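The idea of combining estimates from Gaussian Process models fitted to random subsets can be illustrated with a minimal sketch. This is not the authors' implementation. The squared-exponential kernel, the fixed hyperparameters (`length_scale`, `noise`), sampling without replacement, simple averaging of the posterior means, and all function names are assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed kernel)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior_mean(x_train, y_train, x_test, noise=0.1, length_scale=1.0):
    """Exact GP posterior mean; time is cubic and memory quadratic in len(x_train)."""
    k = rbf_kernel(x_train, x_train, length_scale)
    k += noise**2 * np.eye(len(x_train))
    # Solve with a Cholesky factor rather than forming an explicit inverse.
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return rbf_kernel(x_test, x_train, length_scale) @ alpha


def bagged_gp_predict(x, y, x_test, sample_size=200, n_estimators=10,
                      seed=0, **gp_kwargs):
    """Average the predictions of GPs fitted to small random subsets of the data."""
    rng = np.random.default_rng(seed)
    size = min(sample_size, len(x))
    preds = []
    for _ in range(n_estimators):
        # Assumption: subsets are drawn without replacement.
        idx = rng.choice(len(x), size=size, replace=False)
        preds.append(gp_posterior_mean(x[idx], y[idx], x_test, **gp_kwargs))
    return np.mean(preds, axis=0)


# Hypothetical usage on synthetic data: 5000 noisy samples of sin(x).
demo_rng = np.random.default_rng(1)
x_demo = demo_rng.uniform(-3.0, 3.0, size=(5000, 1))
y_demo = np.sin(x_demo[:, 0]) + 0.1 * demo_rng.standard_normal(5000)
x_grid = np.linspace(-2.5, 2.5, 50)[:, None]
pred_demo = bagged_gp_predict(x_demo, y_demo, x_grid)
```

Each estimator here only factorizes a 200 by 200 matrix, so the cost is governed by `sample_size` and `n_estimators` rather than by the size of the full data set. That is why the sample size acts as the critical parameter.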


