From RNA sequencing measurements to the final results: a practical guide to navigating the choices and uncertainties of gene set analysis

08/29/2023
by Milena Wünsch, et al.

Gene set analysis, a popular approach for analyzing high-throughput gene expression data, aims to identify sets of related genes that show significantly enriched or depleted expression patterns between different conditions. In recent years, a multitude of methods and corresponding tools have been developed for this task. However, clear guidance is lacking: choosing the right method is the first hurdle a researcher faces. No less challenging than this so-called method uncertainty is preprocessing: the researcher must know which steps are required and then select, from the plethora of valid options, an approach that produces the input object the chosen method accepts (data preprocessing uncertainty). Again, clear guidance is scarce. Here, we provide a practical guide through all steps required to conduct gene set analysis, beginning with a concise overview of a selection of established methods, including GSEA and DAVID. We place a special focus on reviewing and explaining the necessary preprocessing steps for each method under consideration (e.g. whether the RNA-Seq data must be transformed), an essential aspect that typically receives only limited attention in both existing reviews and applications. To raise awareness of the spectrum of uncertainties, our review is accompanied by an extensive overview of the literature on valid approaches for each step and by illustrative R code demonstrating the complex analysis pipelines. It ends with a discussion and recommendations for both users and developers to ensure that the results of gene set analysis are, despite the above-mentioned uncertainties, replicable and transparent.
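To make the two kinds of choices concrete, the following is a minimal sketch, not the authors' pipeline and written in Python rather than the R code the guide itself supplies. It shows one possible transformation of raw counts (log2 counts per million with a pseudo-count of 1, which is an assumption) and a plain hypergeometric over-representation test in the spirit of tools such as DAVID, without reproducing DAVID's exact statistic. The function names `log_cpm` and `ora_p_value` and the toy gene identifiers are hypothetical.

```python
from math import comb, log2


def log_cpm(counts):
    """Transform raw counts of one sample to log2(counts per million + 1).

    The pseudo-count of 1 is an illustrative choice; other valid
    transformations exist and may change downstream results.
    """
    total = sum(counts.values())
    return {gene: log2(c / total * 1e6 + 1) for gene, c in counts.items()}


def ora_p_value(universe, de_genes, gene_set):
    """One-sided hypergeometric p-value for over-representation.

    universe: all genes that were tested (the background)
    de_genes: genes called differentially expressed
    gene_set: members of the gene set of interest
    """
    universe = set(universe)
    de = set(de_genes) & universe   # restrict both lists to the background
    gs = set(gene_set) & universe
    N, K, n = len(universe), len(gs), len(de)
    k = len(de & gs)                # observed overlap
    # P(X >= k) for X ~ Hypergeometric(N, K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy example: 20 background genes, a gene set of 5, and 4 differentially
# expressed genes of which 3 fall into the gene set.
universe = [f"g{i}" for i in range(20)]
gene_set = ["g0", "g1", "g2", "g3", "g4"]
de_genes = ["g0", "g1", "g2", "g10"]
p = ora_p_value(universe, de_genes, gene_set)  # 155 / 4845, roughly 0.032
```

Even in this toy setting, the result depends on choices made before the test: which genes form the background, how differential expression was called, and whether the counts were transformed beforehand.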


