Sparse Pseudo-input Local Kriging for Large Non-stationary Spatial Datasets with Exogenous Variables
Gaussian process (GP) regression is a powerful tool for building predictive models of spatial systems, but it scales poorly to large datasets. Its performance deteriorates further on high-dimensional spatial datasets, i.e., spatial datasets that contain exogenous variables. This paper presents Sparse Pseudo-input Local Kriging (SPLK), which approximates the full GP for spatial datasets with exogenous variables. SPLK uses orthogonal cuts to decompose the domain into smaller subdomains and then applies a sparse approximation of the full GP in each subdomain. Continuity of the global predictor is obtained by imposing continuity constraints on the boundaries between neighboring subdomains. Because the domain decomposition scheme applies an independent covariance structure in each region, SPLK captures heterogeneous covariance structures. SPLK achieves computational efficiency through the sparse approximation in each subdomain, which allows it to accommodate large subdomains that contain many data points and possess a homogeneous covariance structure. We apply the proposed method to real and simulated datasets. We conclude that the combination of orthogonal cuts and sparse approximation makes the proposed method an efficient algorithm for large, high-dimensional spatial datasets.
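The two ingredients described above, a cut orthogonal to a spatial axis and a pseudo-input sparse GP fitted separately in each subdomain, can be illustrated with a minimal NumPy sketch. It is not the SPLK implementation: the boundary continuity constraints are omitted, the sparse approximation is assumed to be a subset-of-regressors predictor, pseudo-inputs are taken as a random subset of each subdomain's points, and the toy data, cut location, lengthscales, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def rbf(A, B, lengthscale):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def fit_sparse_gp(X, y, Z, lengthscale, noise_var):
    """Subset-of-regressors predictive mean using pseudo-inputs Z (assumed approximation)."""
    Kzz = rbf(Z, Z, lengthscale)
    Kzx = rbf(Z, X, lengthscale)
    # mean(x*) = K(x*, Z) (noise_var * Kzz + Kzx Kxz)^-1 Kzx y; jitter added for stability
    A = noise_var * Kzz + Kzx @ Kzx.T + 1e-6 * np.eye(len(Z))
    w = np.linalg.solve(A, Kzx @ y)
    return lambda Xs: rbf(Xs, Z, lengthscale) @ w

def target(X):
    # Toy non-stationary surface: low frequency left of the cut, higher to the right.
    s, e = X[:, 0], X[:, 1]
    return np.where(s < 0.5, np.sin(2 * np.pi * s), np.sin(6 * np.pi * s)) + 0.5 * e

rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 2))  # column 0: spatial coordinate, column 1: exogenous variable
y = target(X) + 0.05 * rng.normal(size=400)

cut = 0.5                                # a single cut orthogonal to the spatial axis
lengthscales = {False: 0.3, True: 0.15}  # hypothetical per-subdomain hyperparameters
models = {}
for right in (False, True):
    mask = (X[:, 0] >= cut) == right
    Xr, yr = X[mask], y[mask]
    # Pseudo-inputs: a random subset of the subdomain's points (an assumption).
    Z = Xr[rng.choice(len(Xr), size=30, replace=False)]
    models[right] = fit_sparse_gp(Xr, yr, Z, lengthscales[right], noise_var=0.05**2)

def predict(Xs):
    """Route each test point to the model of the subdomain it falls in."""
    right = Xs[:, 0] >= cut
    out = np.empty(len(Xs))
    for side in (False, True):
        m = right == side
        if m.any():
            out[m] = models[side](Xs[m])
    return out

X_test = rng.uniform(size=(200, 2))
y_test = target(X_test)
preds = predict(X_test)
rmse = float(np.sqrt(np.mean((preds - y_test) ** 2)))
print(f"held-out RMSE: {rmse:.3f}")
```

Each local fit solves a 30 x 30 system instead of one sized by the number of points in the subdomain, which is where the computational saving of a pseudo-input approximation comes from. Because the two local models here are fitted independently, their predictions are generally not continuous across the cut; the continuity constraints described in the abstract are what address that.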