RaJIVE: Robust Angle Based JIVE for Integrating Noisy Multi-Source Data

01/22/2021
by Erica Ponzi, et al.

With the increasing availability of high-dimensional, multi-source data, the identification of joint and data-specific patterns of variability has become a subject of interest in many research areas. Several matrix decomposition methods have been formulated for this purpose, for example JIVE (Joint and Individual Variation Explained) and its angle-based variant, aJIVE. Although the effect of data contamination on the estimated joint and individual components has not been considered in the literature, gross errors and outliers in the data can cause instability in such methods and lead to incorrect estimation of the joint and individual variance components. We focus on the aJIVE factorization method and provide a thorough analysis of the effect of outliers on the resulting variation decomposition. After showing that this effect is not negligible when all data sources are contaminated, we propose a robust extension of aJIVE (RaJIVE) that integrates a robust formulation of the singular value decomposition into the aJIVE approach. The proposed RaJIVE is shown to provide correct decompositions even in the presence of outliers and to improve on the performance of aJIVE. We use extensive simulation studies with different levels of data contamination to compare the two methods. Finally, we describe an application of RaJIVE to a multi-omics breast cancer dataset from The Cancer Genome Atlas. We provide the R package RaJIVE with a ready-to-use implementation of the methods and documentation of code and examples.
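To make the instability concrete, the following minimal Python sketch shows how a single gross error can rotate the leading singular direction of a data block, which is the quantity an SVD-based decomposition builds on. It is an illustration under stated assumptions, not the RaJIVE method or the API of the R package: the toy data, the function names, and the median/MAD clipping step are all hypothetical, and the clipping is only a simple stand-in for a robust SVD, whose actual formulation is given in the full paper.

```python
import math

def leading_right_singular_vector(X, iters=200):
    """Power iteration on X^T X; returns a unit-length direction."""
    n, p = len(X), len(X[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        Xv = [sum(X[i][j] * v[j] for j in range(p)) for i in range(n)]
        w = [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def angle_deg(u, v):
    """Angle in degrees between two unit directions (sign ignored)."""
    cos = abs(sum(a * b for a, b in zip(u, v)))
    return math.degrees(math.acos(min(1.0, cos)))

def median(values):
    s = sorted(values)
    m = len(s) // 2
    return s[m] if len(s) % 2 else 0.5 * (s[m - 1] + s[m])

def clip_by_mad(X, k=3.0):
    """Hypothetical stand-in for robustification (NOT the paper's robust SVD):
    clip every entry to median +/- k * scaled MAD of all entries."""
    flat = [x for row in X for x in row]
    med = median(flat)
    mad = 1.4826 * median([abs(x - med) for x in flat])
    lo, hi = med - k * mad, med + k * mad
    return [[min(max(x, lo), hi) for x in row] for row in X]

# Toy rank-1 block: 11 samples whose scores lie along one true direction.
v_true = [0.6, 0.8, 0.0]
scores = [float(s) for s in range(-5, 6)]
X_clean = [[s * c for c in v_true] for s in scores]

# Contaminate a single entry with a gross error.
X_bad = [row[:] for row in X_clean]
X_bad[0][2] = 100.0

angle_clean = angle_deg(leading_right_singular_vector(X_clean), v_true)
angle_contaminated = angle_deg(leading_right_singular_vector(X_bad), v_true)
angle_clipped = angle_deg(leading_right_singular_vector(clip_by_mad(X_bad)), v_true)

print("angle to true direction, clean data:       ", round(angle_clean, 2))
print("angle to true direction, one outlier:      ", round(angle_contaminated, 2))
print("angle to true direction, outlier + clipping:", round(angle_clipped, 2))
```

In this toy setting the outlier pulls the estimated direction almost orthogonal to the true one, while clipping recovers most of the alignment. Angle-based methods such as aJIVE compare subspaces estimated this way across data blocks, which is why a distorted direction in one block can propagate into the joint and individual components.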


