Efficient and Interpretable Additive Gaussian Process Regression and Application to Analysis of Hourly-recorded NO2 Concentrations in London
This paper focuses on interpretable additive Gaussian process (GP) regression and its efficient implementation for large-scale data with a multi-dimensional grid structure, as commonly encountered in spatio-temporal analysis. A popular and scalable approach in the GP literature for this type of data exploits the Kronecker product structure in the covariance matrix. However, under the existing methodology, its use is limited to covariance functions with a separable product structure, which restricts flexibility in modelling and selecting interaction effects, an important component in many real-life problems. To address these issues, we propose a class of additive GP models constructed from hierarchical ANOVA kernels. Furthermore, we show how the Kronecker method can be extended to the proposed class of models. Our approach allows for easy identification of interaction effects, straightforward interpretation of both main and interaction effects, and efficient implementation for large-scale data. The proposed method is applied to analyse NO2 concentrations during the COVID-19 lockdown in London. Our scalable method enables analysis of hourly-recorded data collected from 59 different stations across the city, adding insight to findings from previous research that used daily or weekly averaged data.
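As a rough illustration of the Kronecker approach mentioned above, the following Python sketch solves a GP linear system on a two-dimensional grid without ever forming the full covariance matrix. It is not the paper's algorithm: it covers only the standard separable case, using an assumed ANOVA-type kernel of the form (1 + k1)(1 + k2), whose expansion 1 + k1 + k2 + k1*k2 contains a constant, two main effects and an interaction. The function names (`se_kernel`, `kron_mvprod`, `kron_solve`), the squared-exponential base kernel and all parameter values are illustrative assumptions.

```python
import numpy as np


def se_kernel(x, lengthscale=1.0):
    """Squared-exponential Gram matrix for 1-D inputs (assumed base kernel)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def kron_mvprod(A, B, v):
    """Compute kron(A, B) @ v without building the Kronecker product.

    Uses the identity kron(A, B) vec(V) = vec(A V B^T), with row-major vec.
    """
    V = v.reshape(A.shape[1], B.shape[1])
    return (A @ V @ B.T).reshape(-1)


def kron_solve(K1, K2, y, noise_var):
    """Solve (kron(K1, K2) + noise_var * I) a = y via eigendecompositions.

    Cost is driven by the two small eigendecompositions, not by the
    full (n1 * n2) x (n1 * n2) matrix.
    """
    w1, Q1 = np.linalg.eigh(K1)
    w2, Q2 = np.linalg.eigh(K2)
    t = kron_mvprod(Q1.T, Q2.T, y)          # rotate into the eigenbasis
    t = t / (np.kron(w1, w2) + noise_var)   # diagonal solve
    return kron_mvprod(Q1, Q2, t)           # rotate back


# Illustrative grid: 5 time points by 4 locations (made-up sizes).
x_time = np.linspace(0.0, 1.0, 5)
x_space = np.linspace(0.0, 1.0, 4)

# Assumed ANOVA-type factors: (1 + k1) and (1 + k2).
K_time = 1.0 + se_kernel(x_time, lengthscale=0.3)
K_space = 1.0 + se_kernel(x_space, lengthscale=0.5)

rng = np.random.default_rng(0)
y = rng.standard_normal(x_time.size * x_space.size)
alpha = kron_solve(K_time, K_space, y, noise_var=0.1)
```

In this sketch every additive term shares a single scale, which is exactly what keeps the covariance a single Kronecker product. Giving the main effects and the interaction their own scale parameters breaks that product form, and handling that case is the extension the abstract describes.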