Probabilistic Predictive Principal Component Analysis for Spatially-Misaligned and High-Dimensional Air Pollution Data with Missing Observations

05/01/2019
by Phuong T. Vu, et al.

Air pollution studies of fine particulate matter (PM_2.5) often need accurate predictions of pollutant concentrations at new locations, because data are usually not measured at all study locations. PM_2.5 is also a mixture of many chemical components. Principal component analysis (PCA) can be used to obtain lower-dimensional representative scores of such multi-pollutant data, and spatial prediction can then estimate these scores at new locations. The recently developed predictive PCA modifies the traditional PCA algorithm to obtain scores with spatial structure that can be predicted well at unmeasured locations. However, these approaches require complete data, whereas multi-pollutant data tend to have complex missingness patterns in practice. We propose probabilistic versions of predictive PCA that allow for flexible model-based imputation, which can account for spatial information and subsequently improve overall predictive performance.
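
To make the idea of combining PCA with imputation concrete, the sketch below alternates between a rank-k PCA fit and re-filling the missing entries of a locations-by-pollutants matrix. It is not the probabilistic predictive PCA proposed in the paper: it ignores spatial information entirely and is only a generic low-rank imputation scheme. The function name `iterative_pca_impute` and its parameters are hypothetical, chosen for illustration.

```python
import numpy as np

def iterative_pca_impute(X, n_components=2, n_iter=50, tol=1e-6):
    """Illustrative only: fill NaNs in X (locations x pollutants) by
    alternating a rank-k PCA reconstruction with re-imputation.
    Assumes every column has at least one observed value."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initialise missing entries with the observed column means.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(n_iter):
        # PCA via SVD of the centred, currently-filled matrix.
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        k = n_components
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mu

        # Overwrite only the missing entries; observed values stay fixed.
        new_filled = np.where(missing, low_rank, X)
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break

    # Final PCA on the completed matrix: scores per location, loadings per pollutant.
    mu = filled.mean(axis=0)
    U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    return filled, scores, loadings
```

In a workflow like the one the abstract describes, the resulting scores would then be passed to a spatial prediction step to estimate them at unmeasured locations; that step is omitted here.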

research
04/11/2020

Spatial Matrix Completion for Spatially-Misaligned and High-Dimensional Air Pollution Data

In health-pollution cohort studies, accurate predictions of pollutant co...
research
01/23/2021

A Geospatial Functional Model For OCO-2 Data with Application on Imputation and Land Fraction Estimation

Data from NASA's Orbiting Carbon Observatory-2 (OCO-2) satellite is esse...
research
04/05/2021

Generalized Joint Probability Density Function Formulation in Turbulent Combustion using DeepONet

Joint probability density function (PDF)-based models in turbulent combu...
research
07/01/2022

Local manifold learning and its link to domain-based physics knowledge

In many reacting flow systems, the thermo-chemical state-space is known ...
research
07/28/2016

Asymptotic properties of Principal Component Analysis and shrinkage-bias adjustment under the Generalized Spiked Population model

With the development of high-throughput technologies, principal componen...
research
06/28/2019

High-dimensional principal component analysis with heterogeneous missingness

We study the problem of high-dimensional Principal Component Analysis (P...
research
10/05/2019

Recurrent neural network based decision support system

Decision Support Systems (DSS) in complex installations play a crucial r...
