RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data

10/28/2019
by Gaoxiang Jia, et al.

Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies, and the diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNA. The NanoString nCounter platform is well suited to profiling FFPE samples and measures gene expression with high sensitivity, which may greatly help realize the scientific and clinical value of these samples. However, methodological development for normalization, a critical step in analyzing this type of data, lags far behind. Existing methods designed for the platform use information from the different types of internal controls separately, and for global scaling they rely on the oversimplified assumption that the expression of housekeeping genes is constant across samples. These methods are therefore not optimized for the nCounter system, not to mention that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture the main patterns and characteristics observed in NanoString data from FFPE samples, and we develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, labeled RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption about housekeeping genes and offers great interpretability. Furthermore, it is applicable to fresh-frozen or similar samples, which can generally be viewed as a reduced case of FFPE samples. Simulations and applications demonstrated the superior performance of RCRnorm.
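To make the criticized assumption concrete, the following is a minimal sketch of conventional housekeeping-gene global scaling, the kind of approach the abstract argues against. It is not RCRnorm and is not taken from the paper. The function names, the sample and gene labels, the counts, and the choice of a geometric mean with an arithmetic-mean reference are all illustrative assumptions.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def housekeeping_scale(counts, housekeeping):
    """Scale each sample so its housekeeping genes share a common level.

    counts: dict mapping sample name -> dict mapping gene name -> raw count.
    housekeeping: list of gene names assumed (hypothetically) to be
    expressed at a constant level across samples.
    """
    # Per-sample summary of housekeeping expression.
    per_sample = {
        sample: geometric_mean([genes[h] for h in housekeeping])
        for sample, genes in counts.items()
    }
    # Reference level: arithmetic mean of the per-sample summaries.
    reference = sum(per_sample.values()) / len(per_sample)
    # One multiplicative factor per sample.
    factors = {sample: reference / m for sample, m in per_sample.items()}
    normalized = {
        sample: {gene: c * factors[sample] for gene, c in genes.items()}
        for sample, genes in counts.items()
    }
    return normalized, factors


# Made-up counts: sample_B has exactly twice the signal of sample_A.
counts = {
    "sample_A": {"HK1": 100.0, "HK2": 400.0, "GENE_X": 50.0},
    "sample_B": {"HK1": 200.0, "HK2": 800.0, "GENE_X": 100.0},
}
normalized, factors = housekeeping_scale(counts, ["HK1", "HK2"])
print(factors)
print(normalized["sample_A"]["GENE_X"], normalized["sample_B"]["GENE_X"])
```

In this toy case the two samples differ only by a uniform factor, so scaling brings GENE_X to the same level in both. If the housekeeping genes themselves varied between samples, the same factor would carry that variation into every other gene, which is the weakness the abstract points to.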


