A Common-Factor Approach for Multivariate Data Cleaning with an Application to Mars Phoenix Mission Data

10/05/2015
by Dongping Fang, et al.

Data quality is fundamental to ensuring that stakeholders can rely on data when making decisions. In real-world applications, such as scientific exploration of extreme environments, it is unrealistic to require that the raw data collected be perfect. As data miners, when it is infeasible to determine physically why and how the data were corrupted in order to clean them, we propose instead to seek the intrinsic structure of the signal and identify the common factors of the multivariate data. With our new data-driven learning method, the common-factor data cleaning approach, we address an interdisciplinary challenge in multivariate data cleaning: complex external impacts that appear to interfere with multiple data measurements at once. Existing data analyses typically process one signal measurement at a time without considering the associations among all signals. We analyze all signal measurements simultaneously to find the hidden common factors that drive all measurements to vary together but do not reflect the true quantities being measured. We use these common factors to reduce the variation in the data without changing the base mean level of the data, so that the physical meaning of the data is not altered.
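The general idea of removing shared variation while keeping each signal's mean can be illustrated with a minimal sketch. This is not the authors' method: it swaps in a plain principal-component (SVD) decomposition as a stand-in for common-factor extraction. The function name `remove_common_factors`, the parameter `n_factors`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def remove_common_factors(X, n_factors=1):
    """Subtract the leading shared components from X, keeping column means.

    X has shape (n_samples, n_signals). PCA via SVD is used here as an
    assumed stand-in for a common-factor model; it is not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    means = X.mean(axis=0)              # base mean level of each signal
    centered = X - means                # variation around the mean
    # Leading singular components capture variation shared across signals.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    common = (U[:, :n_factors] * s[:n_factors]) @ Vt[:n_factors, :]
    # Remove the shared variation, then restore the original mean level.
    return means + (centered - common)


# Synthetic example: four signals disturbed by one shared interference term.
rng = np.random.default_rng(0)
n = 500
interference = rng.normal(size=n)                 # hypothetical external impact
loadings = np.array([1.0, 0.8, 1.2, 0.9])         # how strongly each signal is hit
base = np.array([10.0, 20.0, 30.0, 40.0])         # true mean level of each signal
noise = 0.1 * rng.normal(size=(n, 4))
raw = base + np.outer(interference, loadings) + noise

cleaned = remove_common_factors(raw, n_factors=1)
print("raw variance:    ", raw.var(axis=0))
print("cleaned variance:", cleaned.var(axis=0))
print("mean shift:      ", np.abs(cleaned.mean(axis=0) - raw.mean(axis=0)).max())
```

Because the subtraction is done on mean-centered data, each signal's mean is unchanged and only the variation that moves all signals together is reduced. In practice the number of factors to remove, and whether a shared component is interference rather than real signal, would have to be judged from the data.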
