High Performance Multivariate Spatial Modeling for Geostatistical Data on Manycore Systems

08/03/2020
by Mary Lai O. Salvaña, et al.

Modeling and inferring spatial relationships and predicting missing values of environmental data are among the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique. The latter requires evaluating the expensive Gaussian log-likelihood function, which has impeded the adoption of multivariate geospatial models for large multivariate spatial datasets. However, this large-scale cokriging challenge provides fertile ground for supercomputing implementations for the geospatial statistics community, as it is paramount to scale computational capability to match the growth in environmental data coming from the widespread use of different data collection technologies. In this paper, we develop and deploy large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity of matrix operations and the massive concurrency of parallel systems, we combine low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both dense and approximated computations of the Gaussian log-likelihood function. It demonstrates robust accuracy and scalable performance on a variety of computer systems. On both synthetic and real datasets, the low-rank matrix approximation shows better performance than exact computation while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation. The algorithm quantifies prediction performance and provides a benchmark for measuring the efficiency and accuracy of several approximation techniques in multivariate spatial modeling.
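
To make the cost of the Gaussian log-likelihood concrete, the following is a minimal sketch of the exact (dense) evaluation for a zero-mean Gaussian field. It is not the paper's implementation: it is univariate rather than multivariate, it uses a hypothetical exponential covariance (`exp_cov`, with assumed parameters `variance` and `range_`), and it is written in dependency-free Python purely for illustration. The Cholesky factorization inside it is the cubic-cost step that becomes prohibitive as the number of locations grows.

```python
import math

def exp_cov(locs, variance=1.0, range_=0.1):
    """Hypothetical exponential covariance: variance * exp(-distance / range_).
    This is an illustrative choice, not the model used in the paper."""
    return [[variance * math.exp(-math.dist(p, q) / range_) for q in locs]
            for p in locs]

def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite).
    The nested loops make the O(n^3) cost explicit."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            L[i][j] = (a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def gaussian_loglik(z, cov):
    """Exact log-likelihood of observations z under N(0, cov):
    -0.5 * (n log(2 pi) + log det(cov) + z^T cov^{-1} z)."""
    n = len(z)
    L = cholesky(cov)
    # Forward substitution: solve L y = z, so that z^T cov^{-1} z = y^T y.
    y = [0.0] * n
    for i in range(n):
        y[i] = (z[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    quad = sum(v * v for v in y)
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)

# Toy usage with made-up locations and observations.
locs = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.0)]
z = [0.3, -0.1, 0.2]
print(gaussian_loglik(z, exp_cov(locs, range_=0.3)))
```

In a maximum likelihood fit, a function like this is evaluated once per candidate parameter set, so the factorization cost is paid at every optimization step. That repeated cost is the motivation for approximating the covariance matrix, as the abstract describes.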

research
04/24/2018

Tile Low-Rank Approximation of Large-Scale Maximum Likelihood Estimation on Manycore Architectures

Maximum likelihood estimation is an important statistical technique for ...
research
04/24/2018

Parallel Approximation of the Maximum Likelihood Estimation for the Prediction of Large-Scale Geostatistics Simulations

Maximum likelihood estimation is an important statistical technique for ...
research
07/23/2019

ExaGeoStatR: A Package for Large-Scale Geostatistics in R

Parallel computing in Gaussian process calculation becomes a necessity f...
research
02/26/2016

Multivariate Hawkes Processes for Large-scale Inference

In this paper, we present a framework for fitting multivariate Hawkes pr...
research
09/27/2019

Robust Factor Analysis Parameter Estimation

This paper considers the problem of robustly estimating the parameters o...
research
11/11/2019

Efficiency Assessment of Approximated Spatial Predictions for Large Datasets

Due to the well-known computational showstopper of the exact Maximum Lik...
