Robust principal components for irregularly spaced longitudinal data

03/26/2018
by   Ricardo A. Maronna, et al.

Consider longitudinal data x_{ij}, with i=1,...,n and j=1,...,p_i, where x_{ij} is the j-th observation of the random function X_i(·) observed at time t_j. The goal of this paper is to develop a parsimonious representation of the data by a linear combination of a set of q smooth functions H_k(·) (k=1,...,q), in the sense that x_{ij} ≈ μ_j + ∑_{k=1}^{q} β_{ki} H_k(t_j), that meets three requirements: it is resistant to atypical X_i's ('case contamination'), it is resistant to isolated gross errors at some t_{ij} ('cell contamination'), and it can be applied when some of the x_{ij} are missing ('irregularly spaced' or 'incomplete' data). Two approaches are proposed for this problem. The first addresses all three requirements and is based on ideas similar to MM-estimation (Yohai 1987). The second is a simple and fast estimator that can be applied to complete data with case- and cellwise contamination; it applies a standard robust principal components estimate and then smooths the principal directions. Experiments with real and simulated data suggest that with complete data the simple estimator outperforms its competitors, while the MM estimator is competitive for incomplete data.
Keywords: Principal components, MM-estimator, longitudinal data, B-splines, incomplete data.
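The second ("simple") approach, a robust principal components estimate followed by smoothing of the principal directions, can be illustrated with a small sketch for complete data. Everything below is an assumption made for illustration and is not taken from the paper: spherical PCA stands in for the unspecified "standard robust principal components estimate", a least-squares fit on a Legendre polynomial basis stands in for B-spline smoothing, and the function names (spherical_pca, smooth_directions, demo) and the simulated data are hypothetical.

```python
import numpy as np

def spherical_pca(X, q):
    """Spherical PCA (assumed stand-in for a robust PCA estimate).

    Rows are centered at the coordinatewise median and scaled to unit
    norm, so an atypical case cannot dominate the covariance.
    """
    mu = np.median(X, axis=0)
    Z = X - mu
    norms = np.linalg.norm(Z, axis=1)
    keep = norms > 1e-12                  # drop rows that coincide with the center
    U = Z[keep] / norms[keep, None]
    cov = U.T @ U / U.shape[0]
    _, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    return mu, vecs[:, ::-1][:, :q]       # leading q directions

def smooth_directions(V, t, degree=6):
    """Smooth each direction by least squares on a polynomial basis in t.

    A Legendre basis is used here only to keep the sketch NumPy-only;
    a B-spline basis could be substituted.
    """
    s = 2.0 * (t - t.min()) / (t.max() - t.min()) - 1.0   # map t to [-1, 1]
    B = np.polynomial.legendre.legvander(s, degree)       # shape (p, degree + 1)
    coef, *_ = np.linalg.lstsq(B, V, rcond=None)
    H, _ = np.linalg.qr(B @ coef)         # re-orthonormalize the smoothed directions
    return H

def demo(n=200, p=30, q=2, seed=0):
    """Simulated complete data with 10% case contamination (made-up example)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, p)
    F = np.column_stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
    F /= np.linalg.norm(F, axis=0)        # true smooth functions, unit norm
    scores = rng.normal(size=(n, q)) * np.array([5.0, 3.0])
    X = scores @ F.T + rng.normal(scale=0.2, size=(n, p))
    bad = rng.choice(n, size=n // 10, replace=False)
    X[bad] = rng.normal(scale=10.0, size=(bad.size, p))   # atypical cases
    mu, V = spherical_pca(X, q)
    H = smooth_directions(V, t)
    return mu, H, F

if __name__ == "__main__":
    mu, H, F = demo()
    # Fraction of each true function captured by the estimated subspace.
    print(np.sum((H.T @ F) ** 2, axis=0))
```

With orthonormal columns in H, scores for a new curve x can be obtained as H.T @ (x - mu). That is an ordinary least-squares step and is not resistant to cell contamination, so a robust regression would be needed there to match the requirements stated in the abstract.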


