Comprehensive survey of computational learning methods for analysis of gene expression data in genomics

02/07/2022
by Nikita Bhandari, et al.

Computational analysis methods, including machine learning, have had a significant impact on genomics and medicine. High-throughput gene expression analysis methods such as microarray technology and RNA sequencing produce enormous amounts of data. Traditionally, statistical methods have been used for comparative analysis of gene expression data. However, more complex analyses, such as classification and the discovery of feature genes or sample observations, require more sophisticated computational approaches. In this review, we compile various statistical and computational tools used in the analysis of expression microarray data. Although the methods are discussed in the context of expression microarray data, they can also be applied to the analysis of RNA sequencing or quantitative proteomics datasets. We specifically discuss methods for missing value (gene expression) imputation, feature scaling, selection and extraction of features for dimensionality reduction, and learning and analysis of expression data. We discuss the types of missing values and the methods and approaches usually employed in their imputation. We also discuss methods of data transformation and feature scaling, namely normalization and standardization. Various approaches used in feature selection and extraction are also reviewed. Lastly, learning and analysis methods, including class comparison, class prediction, and class discovery, are described in detail along with their evaluation parameters. We also describe how microarray gene expression data are generated, along with the advantages and limitations of the above-mentioned techniques. We believe that this detailed review will help users select appropriate methods based on the type of data and the expected outcome.
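
To make the preprocessing steps named in the abstract concrete, the following is a minimal sketch of missing value imputation followed by the two feature scaling approaches mentioned, normalization and standardization. It is not taken from the review itself: the function names, the choice of per-gene mean imputation, the min-max form of normalization, and the toy expression values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def impute_gene_mean(values):
    """Replace missing values (None) with the mean of the observed values.

    Mean imputation is only one simple strategy, assumed here for illustration.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("gene has no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant gene carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical expression values for one gene across four samples;
# None marks a missing measurement.
expression = [2.0, None, 4.0, 6.0]

imputed = impute_gene_mean(expression)
normalized = min_max_normalize(imputed)
standardized = z_score_standardize(imputed)
```

In this sketch the missing entry is filled before scaling, so that the minimum, maximum, mean, and standard deviation are computed over a complete vector. Whether scaling should be applied per gene or per sample depends on the downstream analysis and is left open here.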

research
05/04/2023

Fuzzy Gene Selection and Cancer Classification Based on Deep Learning Model

Machine learning (ML) approaches have been used to develop highly accura...
research
06/05/2015

Global Gene Expression Analysis Using Machine Learning Methods

Microarray is a technology to quantitatively monitor the expression of l...
research
09/08/2021

Computational methods for differentially expressed gene analysis from RNA-Seq: an overview

The analysis of differential gene expression from RNA-Seq data has becom...
research
11/03/2021

Multivariate feature ranking of gene expression data

Gene expression datasets are usually of high dimensionality and therefor...
research
02/22/2018

SMAGEXP: a galaxy tool suite for transcriptomics data meta-analysis

Background: With the proliferation of available microarray and high throu...
research
08/29/2023

From RNA sequencing measurements to the final results: a practical guide to navigating the choices and uncertainties of gene set analysis

Gene set analysis, a popular approach for analyzing high-throughput gene...
