When Gaussian Process Meets Big Data: A Review of Scalable GPs

by   Haitao Liu, et al.

The vast quantity of information brought by big data, together with evolving computer hardware, has encouraged success stories in the machine learning community. Meanwhile, it poses challenges for the Gaussian process (GP), a well-known non-parametric and interpretable Bayesian model, which suffers from cubic time complexity with respect to the training size. To improve scalability while retaining desirable prediction quality, a variety of scalable GPs have been presented. However, they have not yet been comprehensively reviewed and discussed in a unifying way so as to be well understood by both academia and industry. To this end, this paper is devoted to reviewing state-of-the-art scalable GPs in two main categories: global approximations, which distill the entire data, and local approximations, which divide the data for subspace learning. In particular, for global approximations we focus on sparse approximations, comprising prior approximations, which modify the prior but perform exact inference, and posterior approximations, which retain the exact prior but perform approximate inference; for local approximations, we highlight the mixture/product of experts, which conducts model averaging over multiple local experts to boost predictions. To present a complete review, recent advances for improving the scalability and model capability of scalable GPs are also covered. Finally, the extensions and open issues regarding the implementation of scalable GPs in various scenarios are reviewed and discussed to inspire novel ideas for future research avenues.
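To make the cubic-complexity bottleneck and the sparse-approximation idea concrete, here is a minimal numpy sketch (not from the paper; kernel length-scale, noise level, and inducing-point placement are illustrative assumptions). It contrasts the exact GP predictive mean, which requires an O(n³) solve against the full n×n kernel matrix, with the subset-of-regressors mean, a classic prior approximation that only solves an m×m system built from m inducing points:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel between row sets A and B (length-scale ls assumed).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def exact_gp_mean(X, y, Xs, noise=1e-2):
    # Exact GP regression: O(n^3) solve against the full n x n kernel matrix.
    K = rbf(X, X) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    return rbf(Xs, X) @ alpha

def sor_gp_mean(X, y, Xs, Z, noise=1e-2):
    # Subset-of-regressors (a prior approximation): O(n m^2) using m inducing points Z.
    Kzz = rbf(Z, Z)
    Kzx = rbf(Z, X)
    A = noise * Kzz + Kzx @ Kzx.T          # m x m system instead of n x n
    mu_z = np.linalg.solve(A, Kzx @ y)
    return rbf(Xs, Z) @ mu_z

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(200)
Xs = np.linspace(-3, 3, 5)[:, None]        # test inputs
Z = np.linspace(-3, 3, 20)[:, None]        # inducing inputs (placement assumed)

exact = exact_gp_mean(X, y, Xs)
approx = sor_gp_mean(X, y, Xs, Z)
print(np.max(np.abs(exact - approx)))      # small when inducing points cover the input space
```

With well-placed inducing points the two predictive means agree closely, while the sparse solve scales linearly in n rather than cubically; this is the trade-off the reviewed sparse approximations formalize.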


Understanding and Comparing Scalable Gaussian Process Regression for Big Data

As a non-parametric Bayesian model which produces informative predictive...

Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks

While Gaussian processes (GPs) are the method of choice for regression t...

Precision Aggregated Local Models

Large scale Gaussian process (GP) regression is infeasible for larger da...

Large-scale Heteroscedastic Regression via Gaussian Process

Heteroscedastic regression which considers varying noises across input d...

Parallel Gaussian Process Regression with Low-Rank Covariance Matrix Approximations

Gaussian processes (GP) are Bayesian non-parametric models that are wide...

Deep Structured Mixtures of Gaussian Processes

Gaussian Processes (GPs) are powerful non-parametric Bayesian regression...

Fast Gaussian Process Predictions on Large Geospatial Fields with Prediction-Point Dependent Basis Functions

In order to perform GP predictions fast in large geospatial fields with ...