The covariance shift (C-SHIFT) algorithm for normalizing biological data

03/29/2020
by Evgenia Chunikhina, et al.

Omics technologies are powerful tools for analyzing patterns in the expression of thousands of genes. Because of systematic variation between experiments, raw gene expression data is often obscured by undesirable technical noise. Various normalization techniques have been designed to remove these non-biological errors before any statistical analysis. One reason for normalizing data is the need to recover the covariance matrix used in gene network analysis. In this paper, we introduce a novel normalization technique called the covariance shift (C-SHIFT) method. This normalization algorithm uses optimization techniques together with the blessing-of-dimensionality philosophy and an energy minimization hypothesis to recover the covariance matrix under additive noise (known in biology as the bias). It is therefore well suited to the analysis of logarithmic gene expression data. Numerical experiments on synthetic data demonstrate the method's advantage over classical normalization techniques, namely rank, quantile, cyclic LOESS (locally estimated scatterplot smoothing), and MAD (median absolute deviation) normalization.
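To make the covariance-recovery problem concrete, the following is a minimal sketch of how additive bias can distort a gene-gene covariance matrix and how two of the classical baselines named in the abstract respond. It is not the C-SHIFT algorithm, which the abstract does not specify. The noise model (one Gaussian offset per sample, independent of the signal), the function names, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def simulate(n_genes=200, n_samples=500, bias_sd=1.0, seed=0):
    """Return (clean, observed) log-expression matrices, genes x samples.

    Assumed model: every sample receives one additive offset that is
    shared by all genes and independent of the true signal.
    """
    rng = np.random.default_rng(seed)
    clean = rng.normal(size=(n_genes, n_samples))      # independent genes
    bias = rng.normal(scale=bias_sd, size=n_samples)   # one offset per sample
    return clean, clean + bias                         # offset broadcast over genes


def mean_offdiag_cov(data):
    """Average off-diagonal entry of the gene-gene covariance matrix."""
    cov = np.cov(data)  # rows (genes) are treated as variables
    n = cov.shape[0]
    return (cov.sum() - np.trace(cov)) / (n * (n - 1))


def median_center(data):
    """Subtract each sample's median (the centering step of MAD-style scaling)."""
    return data - np.median(data, axis=0, keepdims=True)


def quantile_normalize(data):
    """Give every sample the same empirical distribution (ties not handled)."""
    ranks = np.argsort(np.argsort(data, axis=0), axis=0)  # rank of each gene per sample
    reference = np.sort(data, axis=0).mean(axis=1)        # mean of sorted columns
    return reference[ranks]


if __name__ == "__main__":
    clean, observed = simulate()
    print("clean             :", mean_offdiag_cov(clean))
    print("observed (biased) :", mean_offdiag_cov(observed))
    print("median-centered   :", mean_offdiag_cov(median_center(observed)))
    print("quantile-normalized:", mean_offdiag_cov(quantile_normalize(observed)))
```

Under these assumptions, the shared offset adds roughly `bias_sd**2` to every entry of the covariance matrix, so the observed off-diagonal mean should sit near that value while the clean and normalized versions sit near zero. This toy setting is deliberately favorable to the classical methods because the simulated genes are uncorrelated. It says nothing about how these baselines or C-SHIFT behave on data with real correlation structure.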
